1 Introduction

The purpose of this paper is to investigate a number of recently reported exact algorithms for the
maximum clique problem. The actual program code used is presented and critiqued. The computational
study aims to show how implementation details, problem features and hardware platforms
influence algorithmic behaviour in those algorithms.

The Maximum Clique Problem (MCP): A simple undirected graph G is a pair (V, E) where
V is a set of vertices and E a set of edges. An edge {u, v} is in E if and only if $\{u, v\} \subseteq V$ and
vertex u is adjacent to vertex v. A clique is a set of vertices $C \subseteq V$ such that every pair of vertices
in C is adjacent in G. Clique is one of the six basic NP-complete problems given in [10]. It is posed
as a decision problem [GT19]: given a simple undirected graph G = (V, E) and a positive integer
$k \le |V|$, does G contain a clique of size k or more? The optimization problem is then to find the
maximum clique, where $\omega(G)$ is the size of a maximum clique.
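The definition of a clique can be checked directly against a 0/1 adjacency matrix. The following is a minimal sketch only; the class name CliqueCheck and the small example graph are assumptions made for illustration and are not part of the programs studied in this paper.

```java
import java.util.*;

class CliqueCheck {
    // Returns true if every pair of distinct vertices in c is adjacent in A.
    static boolean isClique(int[][] A, List<Integer> c) {
        for (int i = 0; i < c.size(); i++)
            for (int j = i + 1; j < c.size(); j++)
                if (A[c.get(i)][c.get(j)] != 1) return false;
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical graph: triangle 0-1-2 plus vertex 3 attached only to vertex 2.
        int[][] A = {
            {0, 1, 1, 0},
            {1, 0, 1, 0},
            {1, 1, 0, 1},
            {0, 0, 1, 0}
        };
        System.out.println(isClique(A, Arrays.asList(0, 1, 2))); // true
        System.out.println(isClique(A, Arrays.asList(1, 2, 3))); // false: 1 and 3 are not adjacent
    }
}
```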
A graph can be coloured, by which we mean that any pair of adjacent vertices must be given
different colours. We do not use colours, we use integers to label the vertices. The minimum number
of different colours required is then the chromatic number of the graph $\chi(G)$, and $\omega(G) \le \chi(G)$.
Finding the chromatic number is NP-hard.

Exact Algorithms for MCP: We can address the decision and optimization problems with an
exact algorithm, such as a backtracking search [8, 20, 31, 6, 18, 17, 22, 21, 12, 26, 27, 14, 5]. Backtracking
search incrementally constructs the set C (initially empty) by choosing a candidate vertex from
the candidate set P (initially all of the vertices in V) and then adding it to C. Having chosen a
vertex the candidate set is then updated, removing vertices that cannot participate in the evolving
clique. If the candidate set is empty then C is maximal (if it is a maximum we save it) and we then
backtrack. Otherwise P is not empty and we continue our search, selecting from P and adding to C.
There are other scenarios where we can cut off search, i.e. if what is in P is insufficient to
unseat the champion (the largest clique found so far) search can be abandoned. That is, an upper
bound can be computed. Graph colouring can be used to compute an upper bound during search,
i.e. if the candidate set can be coloured with k colours then it can contain a clique no larger than k
[31, 8, 22, 12, 26, 27]. There are also heuristics that can be used when selecting the candidate vertex,
different styles of search, different algorithms to colour the graph and different orders in which to
do this.

Structure of the Paper: In the next section we present in Java the following algorithms: Fahle's
Algorithm 1 [8], Tomita's MCQ [26], MCR [25] and MCS [27], and San Segundo's BBMC [22]. By
using Java and its inheritance mechanism algorithms are presented as modifications of previous
algorithms. Three vertex orderings are presented, one being new to these algorithms. Starting with
the basic algorithm MC we show how minor coding details can significantly impact on performance.
Section 3 presents a chronological review of exact algorithms, starting at 1990. Section 4 is the
computational study. The study investigates MCS and determines where its speed advantage comes
from, measures the benefits resulting from the bit encoding of BBMC, the effectiveness of three
vertex orderings and the potential to be had from tie-breaking within an ordering. New benchmark
problems are then investigated. An established technique for calibrating and scaling results
is then put to the test and is shown to be unsafe. Finally we conclude.

2 Exact Algorithms for Maximum Clique

2 The Algorithms: MC, MCQ, MCS and BBMC 

We start by presenting the simplest algorithm [8] which I will call MC. This sets the scene. It is
presented as a Java class, as are all the algorithms, with instance variables and methods. It
might be said that I am abusing Java by using the class merely as a place holder for global variables
and methods for the algorithms, and inheritance merely to over-ride methods so that I can show
a program-delta, i.e. to present the differences between algorithms.

Each algorithm is first described textually and then the actual implementation is given in Java. 
Sometimes a program trace is given to better expose the workings of the algorithm. It is possible 
to read this section skipping the Java descriptions, however the Java code makes it explicit how 
one algorithm differs from another and shows the details that can severely affect the performance 
of the algorithm. 

We start with MC. MC is essentially a straw man: it is elegant but too simple to be of any 
practical worth. Nevertheless, it has some interesting features. MCQ is then presented as an 
extension to MC, our first algorithm that uses a tight integration of search algorithm, search order 
and upper bound cut off. Our implementation of MCQ allows three different vertex orderings to be 
used, and one of these corresponds to MCR. The presentation of MCQ is somewhat laborious but 
this pays off when we present two variants of MCS as minor changes to MCQ. BBMC is presented 
as an extension of MCQ, but is essentially MCSa with sets implemented using bit strings. Figure 1
shows the hierarchical structure for the algorithms presented.




Fig. 1. The hierarchy of algorithms. 



Appendix 1 gives a description of the execution environment (the MaxClique class) that allows us 
to read clique problems and solve them with a variety of algorithms and colour orderings. It also 
shows how to actually run our programs from the command line. 

2.1 MC 

We start with a simple algorithm, similar to Algorithm 1 in [8]. Fahle's Algorithm 1 uses two
sets: C the growing clique (initially empty) and P the candidate set (initially all vertices in the
graph). C is maximal when P is empty and if |C| is a maximum it is saved, i.e. C becomes the
champion. If |C| + |P| is too small to unseat the champion search can be terminated. Otherwise
search iterates over the vertices in P in turn selecting a vertex v, creating a new growing clique
C' where $C' = C \cup \{v\}$ and a new candidate set P' as the set of vertices in P that are adjacent
to v (i.e. $P' = P \cap \mathit{neighbours}(v)$), and recursing. We will call this MC.
MC in Java: Listing 1.1 can be compared to Algorithm 1 in [8]. The constructor, lines 14 to 22,
takes three arguments: n the number of vertices in the graph, A the adjacency matrix where A[i][j]
equals 1 if and only if vertex i is adjacent to vertex j, and degree where degree[i] is the number of
vertices adjacent to vertex i (and is the sum of A[i]). The variables nodes and cpuTime are used
as measures of search performance, timeLimit is a bound on the run-time, maxSize is the size
of the largest clique found so far, style is used as a flag to customise the algorithm with respect
to ordering of vertices and the array solution is the largest clique found such that solution[i] is
equal to 1 if and only if vertex i is in the largest clique found.

The method search() finds a largest clique¹ or terminates having exceeded the allocated
timeLimit. Two sets are produced: the candidate set P and the current clique C². Vertices from
P may be selected and added to the growing clique C. Initially all vertices are added to P and C
is empty (lines 27 and 28). The sets P and C are represented using Java's ArrayList, a re-sizable-array
implementation of the List interface. Adding an item is an O(1) operation but removing an
arbitrary item is of O(n) cost. This might appear to be a damning indictment of this simple data
structure but as we will see it is the cost we pay if we want to maintain order in P and in many
cases we can work around this to enjoy O(1) performance.

The search is performed in method expand³. In line 34 a test is performed to determine if
the cpu time limit has been exceeded, and if so search terminates. Otherwise we increment the 
number of nodes, i.e. a count of the size of the backtrack search tree explored. The method then 
iterates over the vertices in P (line 36), starting with the last vertex in P down to the first vertex 
in P. This form of iteration over the ArrayList, getting entries with a specific index, is necessary 
when entries are deleted (line 45) as part of that iteration. A vertex v is selected from P (line 38), 
added to C (line 39), and a new candidate set is then created (line 40) newP where newP is the 
set of vertices in P that are adjacent to vertex v (line 41). Consequently all vertices in newP are 
adjacent to all vertices in C and all pairs of vertices in C are adjacent (i.e. C is a clique). If newP 
is empty C is maximal and if it is the largest clique found it is saved (line 42). If newP is not 
empty then C is not maximal and search can proceed via a recursive call to expand (line 43). On 
returning from the recursive call v is removed from P and from C (lines 44 and 45). 

There is one "trick" in expand and that is at line 37: if the combined size of the current clique 
and the candidate set cannot unseat the best clique found so far this branch of the backtrack 
tree can be abandoned. This is the simplest upper bound cut-off and corresponds to line 3 from 
Algorithm 1 in [8]. The method saveSolution does as it says: it saves off the current maximal 
clique and records its size. 

Observations on MC: There are several points of interest. The first is the search process itself.
If we commented out line 37 and changed line 41 to add to newP all vertices in P other than v,
method expand would produce the power set of P and at each depth k in the backtrack tree we
would have $\binom{n}{k}$ calls to expand. That is, expand produces a binomial backtrack search tree of size
$O(2^n)$ (see pages 6 and 7 of [11]). This can be compared to a bifurcating search process, where on
one side we take an element and make a recursive call, and on the other side reject it and make a
recursive call, terminating when P is empty. This generates the power set on the leaf nodes of the
backtrack tree and explores $2^{n+1} - 1$ nodes. This is also $O(2^n)$ but in practice is often twice as
slow as the binomial search. In Figure 2 we see a binomial search produced by a simplification of
MC, generating the power set of {0, 1, 2, 3}. Each node in the tree contains two sets: the set that
will be added to the power set and the set that can be selected from at the next level. We see 16
nodes and at each depth k we have $\binom{4}{k}$ nodes. The corresponding tree for the bifurcating search
(not shown) has 31 nodes with the power set appearing on the 16 leaf nodes at depth 4.
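The two node counts can be reproduced with a few lines of code. This is a sketch under stated assumptions: the class PowerSetSearch and its methods are hypothetical, they only count nodes rather than build sets, and the binomial search iterates upwards over indices whereas MC iterates downwards (the tree size is the same).

```java
class PowerSetSearch {
    static long nodes;

    // Binomial search: each node branches once for every element still available.
    // 'first' is the index of the first element that may still be chosen.
    static void binomial(int first, int n) {
        nodes++;
        for (int i = first; i < n; i++) binomial(i + 1, n);
    }

    // Bifurcating search: each interior node branches twice (take or reject the next element).
    static void bifurcating(int next, int n) {
        nodes++;
        if (next == n) return;        // leaf: one member of the power set
        bifurcating(next + 1, n);     // take element 'next'
        bifurcating(next + 1, n);     // reject element 'next'
    }

    static long countBinomial(int n)    { nodes = 0; binomial(0, n);    return nodes; }
    static long countBifurcating(int n) { nodes = 0; bifurcating(0, n); return nodes; }

    public static void main(String[] args) {
        System.out.println(countBinomial(4));    // 16 nodes, i.e. 2^4
        System.out.println(countBifurcating(4)); // 31 nodes, i.e. 2^5 - 1
    }
}
```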

The second point of interest is the actual Java implementation. Java gives us an elegant construct
for iterating over collections, the for-each loop, used in line 41 of Listing 1.1. This is rewritten

1 There may be more than one largest clique so we say we find "a largest clique" 

2 At least two of the published algorithms name the candidate set P, maybe for "potential" vertices, for 
example [8] and more recently [7]. 

3 We use the same name for this method as Tomita [26, 27].
import java.util.*;

public class MC {
  int[] degree;     // degree of vertices
  int[][] A;        // 0/1 adjacency matrix
  int n;            // n vertices
  long nodes;       // number of decisions
  long timeLimit;   // milliseconds
  long cpuTime;     // milliseconds
  int maxSize;      // size of max clique
  int style;        // used to flavor algorithm
  int[] solution;   // as it says

  MC (int n,int[][] A,int[] degree) {
    this.n = n;
    this.A = A;
    this.degree = degree;
    nodes = maxSize = 0;
    cpuTime = timeLimit = -1;
    style = 1;
    solution = new int[n];
  }

  void search(){
    cpuTime = System.currentTimeMillis();                    // run time is measured from here
    nodes = 0;
    ArrayList<Integer> C = new ArrayList<Integer>();         // growing clique, initially empty
    ArrayList<Integer> P = new ArrayList<Integer>(n);        // candidate set, initially all vertices
    for (int i=0;i<n;i++) P.add(i);
    expand(C,P);
  }

  void expand(ArrayList<Integer> C,ArrayList<Integer> P){
    if (timeLimit > 0 && System.currentTimeMillis() - cpuTime >= timeLimit) return;
    nodes++;
    for (int i=P.size()-1;i>=0;i--){
      if (C.size() + P.size() <= maxSize) return;            // cannot unseat the champion
      int v = P.get(i);
      C.add(v);
      ArrayList<Integer> newP = new ArrayList<Integer>();
      for (int w : P) if (A[v][w] == 1) newP.add(w);         // candidates adjacent to v
      if (newP.isEmpty() && C.size() > maxSize) saveSolution(C);
      if (!newP.isEmpty()) expand(C,newP);
      C.remove((Integer)v);                                  // remove by value: sequential scan
      P.remove((Integer)v);
    }
  }

  void saveSolution(ArrayList<Integer> C){
    Arrays.fill(solution,0);
    for (int i : C) solution[i] = 1;
    maxSize = C.size();
  }
}

Listing 1.1. The basic clique solver 
Fig. 2. A binomial search tree producing the power set of {0, 1, 2, 3}.



in class MC0 (extending MC, overriding the expand method), Listing 1.2 lines 15 to 18: one line
of code is replaced with 4 lines. MC0 gets the j-th element of P, calls it w (line 16 of Listing 1.2)
and if it is adjacent to v it is added to newP (line 17 of Listing 1.2). In MC (line 41 of Listing 1.1)
the for-each statement implicitly creates an iterator object and uses that for selecting elements.
Avoiding the iterator typically results in a 10% reduction in runtime for MC0.

Our third point is how we create our sets. In MC0 line 14 the new candidate set is created with
a capacity of i. Why do that when we can just create newP with no size and let Java work it out
dynamically? And why size i? In the loop of line 10 i counts down from the size of the candidate
set, less one, to zero. Therefore at line 14 P is of size i + 1 and, since v is not adjacent to itself,
newP can hold at most i vertices and we can set its capacity accordingly. If we do not set the size
Java will give newP an initial capacity of 10 and when additions exceed this newP will be re-sized.
By grabbing this space up front we avoid that. This results in yet another measurable reduction in
run-time.

Our fourth point is how we remove elements from our sets. In MC we remove the current vertex
v from C and P (lines 44 and 45) whereas in MC0 we remove the last element in C and P (lines
21 and 22). Clearly v will always be the last element in C and P. The code in MC results in a
sequential scan to find and then delete the last element, i.e. O(n), whereas in MC0 it is a simple
O(1) task. And this raises another question: P and C are really stacks so why not use a Java
Stack? The Stack class is built on a re-sizable array (it extends Vector) and cannot be initialised
with a size, but has a default initial capacity of 10. When the stack grows and exceeds its current
capacity the capacity is doubled and the contents are copied across. Experiments showed that using
a Stack increased run time by a few percentage points.
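The difference between the two removal styles comes down to which overload of ArrayList.remove is called. The sketch below is illustrative only; the class RemoveDemo and its helper methods are assumptions and do not appear in the programs above. Both helpers work on a copy so that the argument is left unchanged.

```java
import java.util.*;

class RemoveDemo {
    // MC style: remove(Object) scans from the front for the first element equal to v.
    static List<Integer> removeByValue(List<Integer> list, int v) {
        List<Integer> copy = new ArrayList<>(list);
        copy.remove((Integer) v);   // without the cast this would call remove(int index)
        return copy;
    }

    // MC0 style: remove(int index) at the last position, no scan and nothing to shift.
    static List<Integer> removeLast(List<Integer> list) {
        List<Integer> copy = new ArrayList<>(list);
        copy.remove(copy.size() - 1);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> P = Arrays.asList(5, 7, 9);
        System.out.println(removeByValue(P, 9)); // [5, 7]
        System.out.println(removeLast(P));       // [5, 7]
    }
}
```

Both calls give the same result when v is the last element, which is exactly the situation in expand; only the amount of work differs.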

Typically MC0 is 50% faster than MC. In many cases a 50% improvement in run time would 
be considered a significant gain, usually brought about by changes in the algorithm. Here, such a 
gain can be achieved by moderately careful coding. And this is our first lesson: when comparing 
published results we need to be cautious as we may be comparing programmer ability as much as 
differences in algorithms. 

The fifth point is that MC makes more recursive calls than it needs to. At line 37 |C| + |P| is
sufficient to proceed but at line 43 it is possible that |C| + |newP| is actually too small and will
generate a failure at line 37 in the next recursive call. We should have a richer condition at line
43 but as we will soon see the algorithms that follow do not need this.
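One possible form of such a richer condition is to recurse only when |C| + |newP| exceeds maxSize. The following self-contained sketch is not one of the algorithms studied here: the class RicherMC, its flag richer and the example graph are assumptions, and the time limit and solution recording of MC are omitted for brevity.

```java
import java.util.*;

class RicherMC {
    final int[][] A;        // 0/1 adjacency matrix
    final boolean richer;   // hypothetical flag: test |C| + |newP| before recursing
    int maxSize = 0;
    long nodes = 0;

    RicherMC(int[][] A, boolean richer) { this.A = A; this.richer = richer; }

    void search() {
        ArrayList<Integer> P = new ArrayList<>();
        for (int i = 0; i < A.length; i++) P.add(i);
        expand(new ArrayList<>(), P);
    }

    void expand(ArrayList<Integer> C, ArrayList<Integer> P) {
        nodes++;
        for (int i = P.size() - 1; i >= 0; i--) {
            if (C.size() + P.size() <= maxSize) return;
            int v = P.get(i);
            C.add(v);
            ArrayList<Integer> newP = new ArrayList<>(i);
            for (int j = 0; j <= i; j++) {
                int w = P.get(j);
                if (A[v][w] == 1) newP.add(w);
            }
            if (newP.isEmpty() && C.size() > maxSize) maxSize = C.size();
            // The richer condition: skip calls that would fail on entry.
            boolean worthwhile = !richer || C.size() + newP.size() > maxSize;
            if (!newP.isEmpty() && worthwhile) expand(C, newP);
            C.remove(C.size() - 1);
            P.remove(i);
        }
    }

    public static void main(String[] args) {
        // Hypothetical graph: triangle 0-1-2, then the path 2-3-4.
        int[][] A = {
            {0, 1, 1, 0, 0},
            {1, 0, 1, 0, 0},
            {1, 1, 0, 1, 0},
            {0, 0, 1, 0, 1},
            {0, 0, 0, 1, 0}
        };
        RicherMC plain = new RicherMC(A, false);
        RicherMC rich = new RicherMC(A, true);
        plain.search();
        rich.search();
        System.out.println(plain.maxSize + " " + plain.nodes);
        System.out.println(rich.maxSize + " " + rich.nodes);
    }
}
```

Both variants find the same maximum clique size; the richer variant can only make the same number of calls to expand or fewer.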

The sixth point is a question of space: why is the adjacency matrix an array of integers when
we could have used booleans? Surely that would have been more space efficient? In Java a boolean
is represented as an integer with 1 being true, everything else false. Therefore there is no saving in
space and only a minuscule saving in time (more code is generated to test if A[i][j] equals 1 than
to test if a boolean is true). Furthermore by representing the adjacency matrix as integers we can
sum a row to get the degree of a vertex.
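Summing rows of the integer adjacency matrix to obtain the degree array might look as follows. This is a small illustrative sketch; the class name DegreeFromRows is an assumption, and in the programs above the degree array is simply passed to the constructor.

```java
class DegreeFromRows {
    // degree[i] is the sum of row i of the 0/1 adjacency matrix A.
    static int[] degrees(int[][] A) {
        int[] degree = new int[A.length];
        for (int i = 0; i < A.length; i++)
            for (int j = 0; j < A[i].length; j++)
                degree[i] += A[i][j];
        return degree;
    }

    public static void main(String[] args) {
        // Hypothetical graph: triangle 0-1-2 plus vertex 3 attached only to vertex 2.
        int[][] A = {
            {0, 1, 1, 0},
            {1, 0, 1, 0},
            {1, 1, 0, 1},
            {0, 0, 1, 0}
        };
        System.out.println(java.util.Arrays.toString(degrees(A))); // [2, 2, 3, 1]
    }
}
```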
And finally, Listing 1.1 shows exactly what is measured. Our run time starts at line 25, at the
start of search. This will include the times to set up the data structures peculiar to an algorithm, 
and any reordering of vertices. It does not include the time to read in the problem or the time 
to write out a solution. There is also no doubt about what we mean by a node: a call to expand 
counts as one more node. 



import java.util.*;

public class MC0 extends MC {

  MC0 (int n,int[][] A,int[] degree) {super(n,A,degree);}

  void expand(ArrayList<Integer> C,ArrayList<Integer> P){
    if (timeLimit > 0 && System.currentTimeMillis() - cpuTime >= timeLimit) return;
    nodes++;
    for (int i=P.size()-1;i>=0;i--){
      if (C.size() + P.size() <= maxSize) return;
      int v = P.get(i);
      C.add(v);
      ArrayList<Integer> newP = new ArrayList<Integer>(i);   // capacity fixed up front
      for (int j=0;j<=i;j++){                                // indexed loop, no iterator
        int w = P.get(j);
        if (A[v][w] == 1) newP.add(w);
      }
      if (newP.isEmpty() && C.size() > maxSize) saveSolution(C);
      if (!newP.isEmpty()) expand(C,newP);
      C.remove(C.size()-1);                                  // remove last element by index
      P.remove(i);
    }
  }
}

Listing 1.2. Inelegant but 50% faster, MC0 extends MC



A trace of MC: We now present three views of the MC search process over a simple problem.
The problem is referred to as g10-50, and is a randomly generated graph with 10 vertices with
edge probability 0.5. This is shown in Figure 3, which has at top a cartoon of the search process
and immediately below a trace of our program. The program prints out the arguments C and P
on each call to expand (between lines 33 and 34), in line 37 if a FAIL occurs, and between lines
38 and 39 when a vertex is selected. The indentation corresponds to the depth of recursion. The
"Line dd" boxes in the cartoon of Figure 3 correspond to the line numbers in the trace of Figure 3,
each of those a call to expand. Green coloured vertices are in P, blue vertices are those in C
and red vertices are those removed from P and C in lines 44 and 45 of Listing 1.1. Also shown
is the backtrack tree. The boxes correspond to calls to expand and contain C and P. On arcs we
have numbers with a down arrow ↓ if that vertex is added to C and an up arrow ↑ if that vertex
is removed from C and P. A clear white box is a call to expand that is an interior node of the
backtrack tree leading to further recursive calls or the creation of a new champion. The green
shriek! is a champion clique and a red shriek! a fail because |C| + |P| was too small to unseat the
champion. The blue boxes correspond to calls to expand that fail first time on entering the loop
at line 36 of Listing 1.1. By looking at the backtrack tree we get a feel for the nature of binomial
search.



2.2 MCQ 



We now present Tomita's algorithm MCQ as Listings 1.3, 1.4, 1.5 and 1.6. But first, a sketch
of the algorithm. MCQ is at heart an extension of MC, performing a binomial search, with two
significant advances. First, the graph induced by the candidate set is coloured using a greedy
sequential colouring algorithm due to Welsh and Powell [30]. This gives an upper bound on the
size of the clique in P that can be added to C. Vertices in P are then selected in decreasing colour
order, that is P is ordered in non-decreasing colour order (highest colour last). And this is the
second advance. Assume we select the i-th entry in P and call it v. We then know that we can
 1 > expand(C:[],P:[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 2 > select 9 C:[] P:[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] -> C:[9] & newP:[0, 6, 8]
 3 > > expand(C:[9],P:[0, 6, 8]
 4 > > select 8 C:[9] P:[0, 6, 8] -> C:[9, 8] & newP:[6]
 5 > > > expand(C:[9, 8],P:[6]
 6 > > > select 6 C:[9, 8] P:[6] -> SAVE: [9, 8, 6]
 7 > > FAIL: |C| + |P| <= 3 C:[9] P:[0, 6]
 8 > select 8 C:[] P:[0, 1, 2, 3, 4, 5, 6, 7, 8] -> C:[8] & newP:[3, 6]
 9 > > expand(C:[8],P:[3, 6]
10 > > FAIL: |C| + |P| <= 3 C:[8] P:[3, 6]
11 > select 7 C:[] P:[0, 1, 2, 3, 4, 5, 6, 7] -> C:[7] & newP:[3, 4]
12 > > expand(C:[7],P:[3, 4]
13 > > FAIL: |C| + |P| <= 3 C:[7] P:[3, 4]
14 > select 6 C:[] P:[0, 1, 2, 3, 4, 5, 6] -> C:[6] & newP:[4, 5]
15 > > expand(C:[6],P:[4, 5]
16 > > FAIL: |C| + |P| <= 3 C:[6] P:[4, 5]
17 > select 5 C:[] P:[0, 1, 2, 3, 4, 5] -> C:[5] & newP:[0, 3]
18 > > expand(C:[5],P:[0, 3]
19 > > FAIL: |C| + |P| <= 3 C:[5] P:[0, 3]
20 > select 4 C:[] P:[0, 1, 2, 3, 4] -> C:[4] & newP:[1, 2]
21 > > expand(C:[4],P:[1, 2]
22 > > FAIL: |C| + |P| <= 3 C:[4] P:[1, 2]
23 > select 3 C:[] P:[0, 1, 2, 3] -> C:[3] & newP:[1, 2]
24 > > expand(C:[3],P:[1, 2]
25 > > FAIL: |C| + |P| <= 3 C:[3] P:[1, 2]
26 > FAIL: |C| + |P| <= 3 C:[] P:[0, 1, 2]

Fig. 3. Cartoon, trace and backtrack-tree for MC on graph g10-50.






colour all the vertices in P corresponding to the entries up to and including the i-th entry using
no more than the colour number of v. Consequently that sub-graph can contain a clique no bigger
than the colour number of v, and if this is too small to unseat the largest clique search can be
abandoned.



MCQ in Java: MCQ extends MC, Listing 1.3 line 3, and has an additional instance variable
colourClass (line 5) such that colourClass[i] is an ArrayList of integers (line 15) and will contain
all the vertices of colour i + 1 and is used when sorting vertices by their colour (lines 45 to
64). At the top of search (method search, lines 12 to 21, Listing 1.3) vertices are sorted (call to
orderVertices(P) at line 19) into some order, and this is described later.

Method expand (lines 23 to 43) corresponds to Figure 2 in [26]. The array colour is local to
the method and holds the colour of the i-th vertex in P. The candidate set P is then sorted in
non-decreasing colour order by the call to numberSort in line 28, and colour[i] is then the colour
of integer vertex P.get(i). The search then begins in the loop at line 29. We first test to see if the
combined size of the current clique plus the colour of vertex v is sufficient to unseat the champion
(the largest clique found so far⁴) (line 30). If it is insufficient the search terminates. Note that
the loop starts at m - 1, the position of the last element in P, and counts down to zero. The i-th
element of P is selected and assigned to v. As in MC we create a new candidate set newP, the
set of vertices (integers) in P that are adjacent to v (lines 33 to 37). We then test to see if C is
maximal (line 38) and if it unseats the champion. If the new candidate set is not empty we recurse
(line 39). Regardless, v is removed from P and from C (lines 40 and 41) as in MC0 (lines 21 and
22).

Method numberSort can be compared to Figure 3 in [26]. numberSort takes as arguments
an ordered ArrayList of integers ColOrd corresponding to vertices to be coloured in that order,
an ArrayList of integers P that will correspond to the coloured vertices in non-decreasing colour
order, and an array of integers colour such that if v = P.get(i) (v is the i-th vertex in P) then
the colour of v is colour[i]. Lines 45 to 64 differ from Tomita's NUMBER-SORT method because
we use the additional arguments ColOrd and the growing clique C as this allows us to easily
implement our next algorithm MCS.

Rather than assign colours to vertices explicitly numberSort places vertices into colour classes,
i.e. if a vertex is not adjacent to any of the vertices in colourClass[i] then that vertex can be placed
into that class and given colour number i + 1 (i + 1 so that colours range from 1 upwards). The
vertices can then be sorted into colour order via a pigeonhole sort, where colour classes are the
pigeonholes.
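The colour-class idea can be tried out in isolation. The sketch below is a simplified stand-alone variant and not the numberSort of Listing 1.3: the class GreedyColourSort, its method colourSort and the example graph are assumptions, and colour classes are created on demand rather than pre-allocated.

```java
import java.util.*;

class GreedyColourSort {
    // Greedily places the vertices of 'order' into colour classes, then reads the classes
    // back in colour order (a pigeonhole sort). Returns { sortedVertices, colours }.
    static int[][] colourSort(int[][] A, int[] order) {
        List<List<Integer>> classes = new ArrayList<>();
        for (int v : order) {
            int k = 0;
            while (k < classes.size() && conflicts(A, v, classes.get(k))) k++;
            if (k == classes.size()) classes.add(new ArrayList<>());  // open a new colour class
            classes.get(k).add(v);
        }
        int[] sorted = new int[order.length];
        int[] colour = new int[order.length];
        int i = 0;
        for (int k = 0; k < classes.size(); k++)
            for (int v : classes.get(k)) {
                sorted[i] = v;
                colour[i++] = k + 1;   // colours range from 1 upwards
            }
        return new int[][] { sorted, colour };
    }

    // True if v is adjacent to some vertex already in the colour class.
    static boolean conflicts(int[][] A, int v, List<Integer> colourClass) {
        for (int w : colourClass) if (A[v][w] == 1) return true;
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical graph: triangle 0-1-2 plus vertex 3 attached only to vertex 2.
        int[][] A = {
            {0, 1, 1, 0},
            {1, 0, 1, 0},
            {1, 1, 0, 1},
            {0, 0, 1, 0}
        };
        int[][] r = colourSort(A, new int[] {0, 1, 2, 3});
        System.out.println(Arrays.toString(r[0])); // [0, 3, 1, 2]
        System.out.println(Arrays.toString(r[1])); // [1, 1, 2, 3]
    }
}
```

In this example three colours are used, so the candidate set can contain a clique of size at most 3, which is indeed the size of the triangle.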

numberSort starts by clearing out the colour classes that might be used (line 48). In lines 49
to 55 vertices are selected from ColOrd and placed into the first colour class in which there are no
conflicts, i.e. a class in which the vertex is not adjacent to any other vertex in that class (lines 51
to 53, and method conflicts). Variable colours records the number of colours used. Lines 56 to 63
are in fact a pigeonhole sort, starting by clearing P and then iterating over the colour classes (loop
starting at line 58) and in each colour class adding those vertices into P (lines 59 to 63). The boolean
method conflicts, lines 66 to 72, takes a vertex v and an ArrayList of vertices colourClass where
vertices in colourClass are not pair-wise adjacent and have the same colour, i.e. the vertices are
an independent set. If vertex v is adjacent to any vertex in colourClass the method returns true
(lines 67 to 70) otherwise false. Note that if vertex v needs to be added into a new colour class in
numberSort the size of that colourClass will be zero, the for loop of lines 67 to 70 will not be
performed and conflicts returns false. The complexity of numberSort is quadratic in the size of
P.

Vertices need to be sorted at the top of search, line 19. To do this we use the class Vertex
in Listing 1.4 and the comparator MCRComparator in Listing 1.5. If i is a vertex in P then the
corresponding Vertex v has an integer index equal to i. The Vertex also has attributes degree and
nebDeg. degree is the degree of the vertex index and nebDeg is the sum of the degrees of the
vertices in the neighbourhood of vertex index. Given an array V of class Vertex this can be sorted



4 This is the terminology used by Pablo San Segundo in a private email communication. 
import java.util.*;

class MCQ extends MC {

  ArrayList[] colourClass;

  MCQ (int n,int[][] A,int[] degree,int style) {
    super(n,A,degree);
    this.style = style;
  }

  void search(){
    cpuTime     = System.currentTimeMillis();
    nodes       = 0;
    colourClass = new ArrayList[n];
    ArrayList<Integer> C = new ArrayList<Integer>(n);
    ArrayList<Integer> P = new ArrayList<Integer>(n);
    for (int i=0;i<n;i++) colourClass[i] = new ArrayList<Integer>(n);
    orderVertices(P);
    expand(C,P);
  }

  void expand(ArrayList<Integer> C,ArrayList<Integer> P){
    if (timeLimit > 0 && System.currentTimeMillis() - cpuTime >= timeLimit) return;
    nodes++;
    int m = P.size();
    int[] colour = new int[m];
    numberSort(C,P,P,colour);                                // colour P and sort it by colour
    for (int i=m-1;i>=0;i--){
      if (C.size() + colour[i] <= maxSize) return;           // colour bound cut off
      int v = P.get(i);
      C.add(v);
      ArrayList<Integer> newP = new ArrayList<Integer>(i);
      for (int j=0;j<=i;j++){
        int u = P.get(j);
        if (A[u][v] == 1) newP.add(u);
      }
      if (newP.isEmpty() && C.size() > maxSize) saveSolution(C);
      if (!newP.isEmpty()) expand(C,newP);
      C.remove(C.size()-1);
      P.remove(i);
    }
  }

  void numberSort(ArrayList<Integer> C,ArrayList<Integer> ColOrd,ArrayList<Integer> P,int[] colour){
    int colours = 0;
    int m = ColOrd.size();
    for (int i=0;i<m;i++) colourClass[i].clear();
    for (int i=0;i<m;i++){
      int v = ColOrd.get(i);
      int k = 0;
      while (conflicts(v,colourClass[k])) k++;               // first class with no conflict
      colourClass[k].add(v);
      colours = Math.max(colours,k+1);
    }
    P.clear();
    int i = 0;
    for (int k=0;k<colours;k++)                              // pigeonhole sort by colour class
      for (int j=0;j<colourClass[k].size();j++){
        int v = (Integer)(colourClass[k].get(j));
        P.add(v);
        colour[i++] = k+1;
      }
  }

  boolean conflicts(int v,ArrayList<Integer> colourClass){
    for (int i=0;i<colourClass.size();i++){
      int w = colourClass.get(i);
      if (A[v][w] == 1) return true;
    }
    return false;
  }



Listing 1.3. MCQ (part 1), Tomita 2003 
using Java's Arrays.sort(V) method in O(n log n) time, and is ordered by default using the
compareTo method in class Vertex. Our method forces a strict ordering of V by non-increasing
degree, tie-breaking on index. And this is one step we take to ensure reproducibility of results.
If we allowed the compareTo method to deliver 0 when two vertices have the same degree then
Arrays.sort would then break ties. If the sort method was unstable, i.e. did not maintain the
relative order of objects with equal keys [3], results may be unpredictable.
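The same strict order can also be written with the comparator combinators available since Java 8. This is an alternative formulation offered as a sketch only, not the code used in the study; the class StrictDegreeOrder and its nested class V are hypothetical stand-ins for the Vertex class.

```java
import java.util.*;

class StrictDegreeOrder {
    // Hypothetical minimal vertex: just an index and a degree.
    static final class V {
        final int index, degree;
        V(int index, int degree) { this.index = index; this.degree = degree; }
    }

    // Non-increasing degree, ties broken by increasing index.
    // Two distinct vertices never compare as equal, so the sort has no ties left to break.
    static final Comparator<V> BY_DEGREE_THEN_INDEX =
        Comparator.comparingInt((V v) -> -v.degree).thenComparingInt(v -> v.index);

    // Returns the vertex indices in the strict order for the given degree array.
    static int[] order(int[] degree) {
        V[] vs = new V[degree.length];
        for (int i = 0; i < degree.length; i++) vs[i] = new V(i, degree[i]);
        Arrays.sort(vs, BY_DEGREE_THEN_INDEX);
        int[] result = new int[vs.length];
        for (int i = 0; i < vs.length; i++) result[i] = vs[i].index;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(order(new int[] {2, 3, 2, 3}))); // [1, 3, 0, 2]
    }
}
```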



import java.util.*;

public class Vertex implements Comparable<Vertex> {

  int index, degree, nebDeg;

  public Vertex(int index,int degree) {
    this.index  = index;
    this.degree = degree;
    nebDeg      = 0;
  }

  public int compareTo(Vertex v){
    // non-increasing degree, tie-breaking on index: never returns 0
    if (degree < v.degree || degree == v.degree && index > v.index) return 1;
    return -1;
  }
}



Listing 1.4. Vertex 



The class MCRComparator (Listing 1.5) allows us to sort vertices by non-increasing degree, 
tie breaking on the sum of the neighbourhood degree nebDeg and then on index, giving again a 
strict order. This is the MCR order given in [25], where MCQ uses the simple degree ordering and 
MCR is MCQ with tie-breaking on neighbourhood degree. 



import java.util.*;

public class MCRComparator implements Comparator {

  public int compare(Object o1,Object o2){
    Vertex u = (Vertex) o1;
    Vertex v = (Vertex) o2;
    // non-increasing degree, then neighbourhood degree, then index
    if (u.degree < v.degree ||
        u.degree == v.degree && u.nebDeg < v.nebDeg ||
        u.degree == v.degree && u.nebDeg == v.nebDeg && u.index > v.index) return 1;
    return -1;
  }
}

Listing 1.5. MCRComparator 



Vertices can also be sorted into a minimum-width order via method minWidthOrder of lines
86 to 98 of Listing 1.6. The minimum width order (mwo) was proposed by Freuder [3] and also by
Matula and Beck [16], where it was called "smallest last", and more recently in [7] as a degeneracy
ordering. The method minWidthOrder sorts the array V of Vertex into a mwo. The vertices of V
are copied into an ArrayList L (lines 87 and 89). The while loop
starting at line 90 selects the vertex in L with smallest degree (lines 91 and 92) and calls it v.
Vertex v is pushed onto the stack S and removed from L (line 93) and all vertices in L that are
adjacent to v have their degree reduced (line 94). On termination of the while loop vertices are
popped off the stack and placed back into V giving a minimum width (smallest last) ordering.
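A small worked example may help to see the smallest-last behaviour. The sketch below follows the steps just described but works on plain vertex indices; the class MinWidthDemo and the example graph are assumptions made for illustration.

```java
import java.util.*;

class MinWidthDemo {
    // Returns the vertex indices of the graph in minimum width (smallest last) order.
    static int[] minWidthOrder(int[][] A) {
        int n = A.length;
        int[] degree = new int[n];
        List<Integer> L = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            L.add(i);
            for (int j = 0; j < n; j++) degree[i] += A[i][j];
        }
        Deque<Integer> S = new ArrayDeque<>();   // used as a stack
        while (!L.isEmpty()) {
            int v = L.get(0);
            for (int u : L) if (degree[u] < degree[v]) v = u;   // first vertex of smallest degree
            S.push(v);
            L.remove(Integer.valueOf(v));
            for (int u : L) if (A[u][v] == 1) degree[u]--;      // v has left the graph
        }
        int[] order = new int[n];
        int k = 0;
        while (!S.isEmpty()) order[k++] = S.pop();
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical graph: triangle 0-1-2 plus vertex 3 attached only to vertex 2.
        int[][] A = {
            {0, 1, 1, 0},
            {1, 0, 1, 0},
            {1, 1, 0, 1},
            {0, 0, 1, 0}
        };
        System.out.println(Arrays.toString(minWidthOrder(A))); // [2, 1, 0, 3]
    }
}
```

Vertex 3, of degree 1, is removed first and so appears last in the order.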

Method orderVertices (Listing 1.6, lines 74 to 84) is then called once, at the top of search. The
array of Vertex V is created for sorting in lines 75 and 76, and the sum of the neighbourhood degrees
  void orderVertices(ArrayList<Integer> ColOrd){             // line 74: continues class MCQ of Listing 1.3
    Vertex[] V = new Vertex[n];
    for (int i=0;i<n;i++) V[i] = new Vertex(i,degree[i]);
    for (int i=0;i<n;i++)
      for (int j=0;j<n;j++)
        if (A[i][j] == 1) V[i].nebDeg = V[i].nebDeg + degree[j];
    if (style == 1) Arrays.sort(V);                          // non-increasing degree
    if (style == 2) minWidthOrder(V);                        // minimum width order
    if (style == 3) Arrays.sort(V,new MCRComparator());      // MCR order
    for (Vertex v : V) ColOrd.add(v.index);
  }

  void minWidthOrder(Vertex[] V){
    ArrayList<Vertex> L = new ArrayList<Vertex>(n);
    Stack<Vertex> S = new Stack<Vertex>();
    for (Vertex v : V) L.add(v);
    while (!L.isEmpty()){
      Vertex v = L.get(0);
      for (Vertex u : L) if (u.degree < v.degree) v = u;
      S.push(v); L.remove(v);
      for (Vertex u : L) if (A[u.index][v.index] == 1) u.degree--;
    }
    int k = 0;
    while (!S.isEmpty()) V[k++] = S.pop();
  }
}



Listing 1.6. MCQ (part 2), Tomita 2003 



is computed in lines 77 to 79. ColOrd is then sorted in one of three orders: style == 1 in 
non-increasing degree order, style == 2 in minimum width order, style == 3 in non-increasing 
degree order tie-breaking on sum of neighbourhood degree. MCQ then uses the ordered candidate 
set P for colouring, initially in one of the initial orders, thereafter in the order resulting 
from numberSort, that is, non-decreasing colour order. In [26] it is claimed that this is an 
improving order (however, no evidence was presented for this claim). In [25] Tomita proposes a 
new algorithm, MCR, where MCR is MCQ with a different initial ordering of vertices. Here MCR is 
MCQ with style = 3. 



A trace of MCQ: Figure 4 shows a cartoon and trace of MCQ over graph g10-50. Print statements 
were placed immediately after the call to expand (Listing 1.3 line 24), after the selection of a 
vertex v (line 31) and just before v is rejected from P and C (line 40). Each picture in the 
cartoon gives the corresponding line numbers in the trace immediately below. Line 0 of the trace 
is a print out of the ordered array V just after line 83 in method orderVertices in Listing 1.6. 
This shows for each vertex the pair <index, degree>: the first call to expand has 
P = {3, 0, 4, 6, 1, 2, 5, 8, 9, 7}, i.e. non-increasing degree order. MCQ makes 3 calls to 
expand whereas MC makes 9 calls, and the MCQ colour bound cut off in line 30 of Listing 1.3 is 
satisfied twice (trace lines 9 and 11). 



Observations on MCQ: We noted above that MC can make recursive calls that immediately fail. 
Can this happen in MCQ? Looking at line 39 of Listing 1.3, for the next call not to fail 
immediately |C| + |newP| must be greater than maxSize. Since colour[i], the colour of the 
selected vertex, was sufficient to satisfy the condition of line 30, it must be that integer 
vertex v (line 31) is adjacent to at least colour[i] - 1 vertices in P (one in each lower colour 
class) and thus in newP. With v already added to C, |C| + |newP| exceeds maxSize, therefore the 
next recursive call will not immediately fail. Consequently each call to expand corresponds to 
an internal node in the backtrack tree. 

We also see again exactly what is measured as cpu time: it includes the creation of our data 
structures, the reordering of vertices at the top of search and all recursive calls to expand (lines 
12 to 20). 

Why is colourClass an ArrayList[] rather than an ArrayList<ArrayList<Integer>>? That would have 
done away with the explicit cast in line 60. When using an ArrayList<ArrayList<Integer>> we can 
do away with the explicit cast but Java generates an implicit cast, so nothing is gained. 
[Cartoon panels: Top of search, Line 1&2, Line 3&4, Line 5&6] 

 0   <3,5> <0,4> <4,4> <6,4> <1,3> <2,3> <5,3> <8,3> <9,3> <7,2> 
 1 > expand(C:[],P:[3, 0, 4, 6, 1, 2, 5, 8, 9, 7] 
 2 > select 9 C:[] P:[3, 0, 4, 6, 1, 2, 7, 5, 8, 9] -> C:[9] & newP:[0, 6, 8] 
 3 > > expand(C:[9],P:[0, 6, 8] 
 4 > > select 8 C:[9] P:[0, 6, 8] -> C:[9, 8] & newP:[6] 
 5 > > > expand(C:[9, 8],P:[6] 
 6 > > > select 6 C:[9, 8] P:[6] -> SAVE: [9, 8, 6] 
 7 > > > reject 6 C:[9, 8, 6] P:[6] 
 8 > > reject 8 C:[9, 8] P:[0, 6, 8] 
 9 > > select 6 C:[9] P:[0, 6] -> FAIL: vertex 6 colour too small (colour = 1) 
10 > reject 9 C:[9] P:[3, 0, 4, 6, 1, 2, 7, 5, 8, 9] 
11 > select 8 C:[] P:[3, 0, 4, 6, 1, 2, 7, 5, 8] -> FAIL: vertex 8 colour too small (colour = 3) 

Fig. 4. Trace of MCQ1 on graph g10-50. 

At the top of MCQ's search Tomita sorts vertices in non-increasing degree order and the first 
Δ vertices are given colours 1 to Δ respectively, where Δ is the maximum degree in the graph; 
thereafter vertices are given colour Δ + 1. This is done to prime the candidate set for the 
initial call to EXPAND. Thereafter Tomita calls NUMBER-SORT immediately before the recursive 
call to EXPAND. A simpler option is taken here: colouring and sorting of the candidate set is 
done at the start of expand (call to numberSort at line 28). This strategy is adopted in all the 
algorithms here. 
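The priming step attributed to Tomita above can be sketched as follows. This is an illustration only and is not part of the listings in this paper: the class and method names are hypothetical, and the input is assumed to be the vertex degrees already sorted in non-increasing order (the example uses the degrees shown in the trace of Fig. 4).

```java
import java.util.Arrays;

class InitialColouring {
    // orderedDegrees: vertex degrees, assumed already sorted non-increasing.
    // Returns the priming colour for each position in that order.
    static int[] primeColours(int[] orderedDegrees) {
        int n = orderedDegrees.length;
        // Maximum degree: the first entry of a non-increasing order.
        int maxDeg = (n == 0) ? 0 : orderedDegrees[0];
        int[] colour = new int[n];
        for (int i = 0; i < n; i++)
            // First maxDeg positions get colours 1..maxDeg, the rest maxDeg + 1.
            colour[i] = (i < maxDeg) ? i + 1 : maxDeg + 1;
        return colour;
    }

    public static void main(String[] args) {
        int[] deg = {5, 4, 4, 4, 3, 3, 3, 3, 3, 2}; // degrees from the g10-50 trace
        System.out.println(Arrays.toString(primeColours(deg)));
    }
}
```

With maximum degree 5 the first five positions receive colours 1 to 5 and the remaining positions receive colour 6.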

2.3 MCS 

Tomita's MCR [25] is MCQ with a richer initial ordering, essentially non-increasing degree with 
tie-breaking on the sum of the degrees of adjacent vertices. This ordering is then modified during 
search via the colouring routine numberSort as in MCQ. MCR is compared to MCQ in [25] over 8 
of the 66 instances of the DIMACS benchmarks [1], showing an improvement in MCR over MCQ. 
As previously stated, MCR is our MCQ with style = 3. 

MCS [27] is MCR with two further modifications. The first modification is that MCS uses "... 
an adjunct ordered set of vertices for approximate coloring". This is an ordered list of vertices 
to be used in the sequential colouring, and was called V_a. This order is static, set at the top 
of search. Therefore, rather than use the order in the candidate set P for colouring the vertices 
in P, the vertices in P are coloured in the order of vertices in V_a. 

The second modification is to use a repair mechanism when colouring vertices (this is called 
Re-NUMBER in Figure 1 of [27]). When colouring vertices an attempt is made to reduce the 
colours used by performing exchanges between vertices in different colour classes. This is similar 
to the Sequential-X algorithm of Maffray and Preissmann [15], i.e. a bi-chromatic exchange is 
performed between a subset of vertices in a pair of colour classes indifferent to a given vertex. 
In [27] a recolouring of a vertex v occurs when a new colour class is about to be opened for v 
and that colour class exceeds the search bound, i.e. if the number of colours can be reduced this 
could result in search being cut off. In the context of colouring I will say that vertices u and v 
conflict if they are adjacent, and that v conflicts with a colour class C if there exists a vertex 
u ∈ C that is in conflict with v. Assume vertex v is in colour class C_k. If there exists a lower 
colour class C_i (i < k - 1) in which v conflicts only with a single vertex w ∈ C_i, and there 
also exists a colour class C_j, where i < j < k, such that w does not conflict with any vertex in 
C_j, then we can place v in C_i and w in C_j, freeing up colour class C_k. This is given in 
Figure 1 of [27] and the procedure is named Re-NUMBER. 

Figure 5 illustrates this procedure. The boxes correspond to colour classes i, j and k where 
i < j < k. The circles correspond to vertices in that colour class and the red arrowed lines 
represent conflicts between pairs of vertices. Vertex v has just been added to colour class k, 
v conflicts only with w in colour class i and w has no conflicts in colour class j. We can then 
move w to colour class j and v to colour class i. 
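The repair scenario just described can be worked through on a small example. The sketch below is self-contained and is not the code of Listing 1.8: the four-vertex graph (w = 0, a = 1, b = 2, v = 3 with edges {v,w}, {v,b}, {a,b}) and all names are hypothetical, chosen so that a sequential colouring in the order w, a, b, v produces the classes {w, a}, {b}, {v}.

```java
import java.util.ArrayList;
import java.util.List;

class RepairDemo {
    static int[][] A;                        // adjacency matrix of the example graph
    static List<List<Integer>> colourClass;  // colourClass.get(0) is the lowest colour

    static boolean conflicts(int v, List<Integer> cls) {
        for (int u : cls) if (A[v][u] == 1) return true;
        return false;
    }

    // The one vertex of cls adjacent to v, or -1 when there are none or several.
    static int singleConflict(int v, List<Integer> cls) {
        int found = -1, count = 0;
        for (int u : cls) if (A[v][u] == 1) { found = u; count++; }
        return count == 1 ? found : -1;
    }

    // Try to empty class k by moving v into a lower class i and pushing
    // its single conflict w up into some class j with i < j < k.
    static boolean repair(int v, int k) {
        for (int i = 0; i < k - 1; i++) {
            int w = singleConflict(v, colourClass.get(i));
            if (w < 0) continue;
            for (int j = i + 1; j < k; j++)
                if (!conflicts(w, colourClass.get(j))) {
                    colourClass.get(k).remove((Integer) v);
                    colourClass.get(i).remove((Integer) w);
                    colourClass.get(i).add(v);
                    colourClass.get(j).add(w);
                    return true;
                }
        }
        return false;
    }

    // Hypothetical example: classes i = {w, a}, j = {b}, k = {v}.
    static void setUp() {
        A = new int[4][4];
        int[][] edges = {{3, 0}, {3, 2}, {1, 2}};
        for (int[] e : edges) { A[e[0]][e[1]] = 1; A[e[1]][e[0]] = 1; }
        colourClass = new ArrayList<List<Integer>>();
        for (int c = 0; c < 3; c++) colourClass.add(new ArrayList<Integer>());
        colourClass.get(0).add(0); colourClass.get(0).add(1);
        colourClass.get(1).add(2);
        colourClass.get(2).add(3);
    }

    public static void main(String[] args) {
        setUp();
        System.out.println(repair(3, 2) + " " + colourClass);
    }
}
```

Here v conflicts only with w in the lowest class and w has no conflict with b, so the repair succeeds and the classes become {a, v}, {b, w} and an empty third class: one colour fewer.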




Fig. 5. A repair scenario with colour classes i, j and k. 

Experiments were then presented in [27] comparing MCR against MCS in which MCS is always 
the champion. But it is not clear where the advantage of MCS comes from: does it come from the 
static colour order (the "adjunct ordered set") or does it come from the colour repair mechanism? 

I now present two versions of MCS. The first I call MCSa and this uses the static colouring 
order. The second, MCSb, uses the static colouring ordering and the colour repair mechanism (so 
MCSb is Tomita's MCS). Consequently we will be able to determine where the improvement in 
MCS comes from: static colour ordering or colour repair. 

MCSa in Java: In Listing 1.7 we present MCSa as an extension to MCQ. Method search creates 
an explicit colour ordering ColOrd and the expand method is called with this in line 18 (compare 
this to line 20 of MCQ). Method expand now takes three arguments: the growing clique C, the 
candidate set P and the colouring order ColOrd. In line 26 numberSort is called using ColOrd 
(compare to line 28 in MCQ) and lines 27 to 45 are essentially the same as lines 29 to 42 in MCQ 
with the exception that ColOrd must also be copied and updated (lines 32, 36 and 37) prior to the 
recursive call to expand (line 40) and then down-dated after the recursive call (line 43). Therefore 
MCSa is a simple extension of MCQ and like MCQ has three styles of ordering. 

MCSb in Java: In Listing 1.8 we present MCSb as an extension to MCSa: the difference between 
MCSb and MCSa is in numberSort, with the addition of lines 10 and 20. At line 10 we compute 
delta as the minimum number of colour classes required to match the search bound. At line 20, 
if the number of colour classes now exceeds delta, a new colour class k has just been opened for 
vertex v, and we can repair the colouring such that one less colour class is used, we decrement 
the number of colours used. This repair is done in the boolean method repair of lines 43 to 57. 
The repair method returns true if vertex v in colour class k can be recoloured into a lower 
colour class, false otherwise, and can be compared to Tomita's Re-NUMBER procedure. We search 
for a colour class i, where i < k - 1, in which there exists only one vertex in conflict with v 
and we call this w (line 45). The method getSingleConflictVariable, lines 32 to 41, takes as 
arguments a vertex v and a colour class, counts the vertices in that class that conflict with v 
and records the last one found. If the count exceeds 1 we return a negative integer, otherwise 
we deliver the single conflicting vertex (or -1 if there is none). The repair method then 
proceeds at line 46 if a single conflicting vertex w was found, searching for a colour class j 
above i (for loop of line 47) in which there are no conflicts with w. If that was found (line 48) 
vertex v is removed from colour class k, w is removed from colour class i, v is added to colour 
class i and w to colour class j (lines 49 to 52) and repair delivers true (line 53). Otherwise, 
no repair occurred (line 56). 






import java.util.*;

class MCSa extends MCQ {

    MCSa (int n,int[][] A,int[] degree,int style) {
        super(n,A,degree,style);
    }

    void search(){
        cpuTime                   = System.currentTimeMillis();
        nodes                     = 0;
        colourClass               = new ArrayList[n];
        ArrayList<Integer> C      = new ArrayList<Integer>(n);
        ArrayList<Integer> P      = new ArrayList<Integer>(n);
        ArrayList<Integer> ColOrd = new ArrayList<Integer>(n);
        for (int i=0;i<n;i++) colourClass[i] = new ArrayList<Integer>(n);
        orderVertices(ColOrd);
        expand(C,P,ColOrd);
    }

    void expand(ArrayList<Integer> C,ArrayList<Integer> P,ArrayList<Integer> ColOrd){
        if (timeLimit > 0 && System.currentTimeMillis() - cpuTime >= timeLimit) return;
        nodes++;
        int m = ColOrd.size();
        int[] colour = new int[m];
        numberSort(C,ColOrd,P,colour);
        for (int i=m-1;i>=0;i--){
            int v = P.get(i);
            if (C.size() + colour[i] <= maxSize) return;
            C.add(v);
            ArrayList<Integer> newP = new ArrayList<Integer>(i);
            ArrayList<Integer> newColOrd = new ArrayList<Integer>(i);
            for (int j=0;j<=i;j++){
                int u = P.get(j);
                if (A[u][v] == 1) newP.add(u);
                int w = ColOrd.get(j);
                if (A[v][w] == 1) newColOrd.add(w);
            }
            if (newP.isEmpty() && C.size() > maxSize) saveSolution(C);
            if (!newP.isEmpty()) expand(C,newP,newColOrd);
            C.remove(C.size()-1);
            P.remove(i);
            ColOrd.remove((Integer)v);
        }
    }
}

Listing 1.7. MCSa, Tomita 2010 
import java.util.*;

class MCSb extends MCSa {

    MCSb (int n,int[][] A,int[] degree,int style) {
        super(n,A,degree,style);
    }

    void numberSort(ArrayList<Integer> C,ArrayList<Integer> ColOrd,ArrayList<Integer> P,int[] colour){
        int delta   = maxSize - C.size();
        int colours = 0;
        int m       = ColOrd.size();
        for (int i=0;i<m;i++) colourClass[i].clear();
        for (int i=0;i<m;i++){
            int v = ColOrd.get(i);
            int k = 0;
            while (conflicts(v,colourClass[k])) k++;
            colourClass[k].add(v);
            colours = Math.max(colours,k+1);
            if (k+1 > delta && colourClass[k].size() == 1 && repair(v,k)) colours--;
        }
        P.clear();
        int i = 0;
        for (int k=0;k<colours;k++)
            for (int j=0;j<colourClass[k].size();j++){
                int v = (Integer)(colourClass[k].get(j));
                P.add(v);
                colour[i++] = k+1;
            }
    }

    int getSingleConflictVariable(int v,ArrayList<Integer> colourClass){
        int conflictVar = -1;
        int count       = 0;
        for (int i=0;i<colourClass.size() && count<2;i++){
            int w = colourClass.get(i);
            if (A[v][w] == 1) {conflictVar = w; count++;}
        }
        if (count > 1) return -count;
        return conflictVar;
    }

    boolean repair(int v,int k){
        for (int i=0;i<k-1;i++){
            int w = getSingleConflictVariable(v,colourClass[i]);
            if (w >= 0)
                for (int j=i+1;j<k;j++)
                    if (!conflicts(w,colourClass[j])){
                        colourClass[k].remove((Integer)v);
                        colourClass[i].remove((Integer)w);
                        colourClass[i].add(v);
                        colourClass[j].add(w);
                        return true;
                    }
        }
        return false;
    }
}

Listing 1.8. MCSb, Tomita 2010 
Observations on MCS: Tomita did not investigate where MCS's improvement comes from and 
neither did [5], coding up MCS in Python in one piece. We can also tune MCS. In MCSb we repair 
colourings when we open a new colour class that exceeds the search bound. We could instead 
repair unconditionally every time we open a new colour class, attempting to maintain a compact 
colouring. We do not investigate this here. 

2.4 BBMC 

San Segundo's BB-MaxClique algorithm [22] (BBMC) is similar to the earlier algorithms in that 
vertices are selected from the candidate set to add to the current clique in non-increasing colour 
order, with a colour cut-off within a binomial search. BBMC is at heart a bit-set encoding of 
MCSa with the following features. 

1. The "BB" in "BB-MaxClique" is for "Bit Board". Sets are represented using bit strings. 

2. BBMC colours the candidate set using a static sequential ordering, the ordering set at the top 
of search, the same as MCSa. 

3. BBMC represents the neighbourhood of a vertex and its inverse neighbourhood as bit strings, 
rather than use an adjacency matrix and its complement. 

4. When colouring takes place a colour class perspective is taken, determining what vertices 
can be placed in a colour class together, before moving on to the next colour class. Other 
algorithms (e.g. [26,27]) take a vertex perspective, deciding on the colour of a vertex. 

The candidate set P is encoded as a bit string, as is the currently growing clique C. For a given 
vertex v we also have its neighbourhood N[v], again encoded as a bit string, and its inverse 
neighbourhood invN[v]. That is, invN[v] defines the set of vertices that are not adjacent to v 
(invN[v] is the complement of N[v]) and this is used in colouring. When a vertex v is selected 
from the candidate set P to be added to C a new candidate set newP is produced, where newP 
is the set of vertices in the candidate set P that are adjacent to v, i.e. newP = P ∧ N[v]. When 
the set elements reside within word boundaries this operation is fast. 
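As a small illustration of the update newP = P ∧ N[v], the sketch below uses java.util.BitSet. The class and method names are hypothetical; the example data is taken from the trace of Fig. 4, where selecting vertex 9 from the full candidate set leaves newP = [0, 6, 8], so the neighbourhood of vertex 9 is {0, 6, 8}. Vertex renaming, as done in BBMC, is ignored here.

```java
import java.util.BitSet;

class CandidateUpdate {
    // Returns P AND N[v] without modifying P: BitSet.and mutates its
    // receiver, so the intersection is taken on a clone.
    static BitSet newCandidates(BitSet P, BitSet Nv) {
        BitSet newP = (BitSet) P.clone();
        newP.and(Nv);
        return newP;
    }

    public static void main(String[] args) {
        BitSet P = new BitSet(10);
        P.set(0, 10);                         // candidate set {0,...,9}
        BitSet N9 = new BitSet(10);
        N9.set(0); N9.set(6); N9.set(8);      // neighbourhood of vertex 9 (from Fig. 4)
        System.out.println(newCandidates(P, N9)); // prints {0, 6, 8}
    }
}
```

The intersection is performed one machine word at a time inside BitSet, which is where the speed of the encoding comes from.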

BBMC takes a "colour-class" perspective. BBColour starts off building the first colour class, 
i.e. the first set of vertices in P that form an independent set. It then finds the next colour 
class, and so on until P is exhausted. Given a set of vertices Q, when a vertex v is selected and 
removed from Q and added to a colour class, Q becomes the set of vertices that are in Q but not 
adjacent to v. That is, Q = Q ∧ invN[v]. Colour classes are then combined using a pigeonhole 
sort delivering a list of vertices in non-decreasing colour order and this is then used in the 
BBMaxClique method (the BBMC equivalent of expand) to cut off search as in MCQ and MCSa. 
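The colour-class perspective can be tried out on a small graph with the self-contained sketch below. It follows the loop structure of BBColour in Listing 1.9 (vertices are written straight into the ordered array, so no separate pigeonhole sort is needed), but the class name, helper method and the 5-cycle example graph are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.BitSet;

class BitColourDemo {
    // invN[v] holds the vertices not adjacent to v (v itself included).
    static BitSet[] inverseNeighbourhoods(int[][] A) {
        int n = A.length;
        BitSet[] invN = new BitSet[n];
        for (int i = 0; i < n; i++) {
            invN[i] = new BitSet(n);
            for (int j = 0; j < n; j++) if (A[i][j] == 0) invN[i].set(j);
        }
        return invN;
    }

    // Fills U with the vertices of P in non-decreasing colour order and
    // colour[i] with the colour class of U[i].
    static void colour(BitSet P, BitSet[] invN, int[] U, int[] colour) {
        BitSet copyP = (BitSet) P.clone();
        int colourClass = 0, i = 0;
        while (!copyP.isEmpty()) {
            colourClass++;                       // open a new colour class
            BitSet Q = (BitSet) copyP.clone();   // vertices still eligible for it
            while (!Q.isEmpty()) {
                int v = Q.nextSetBit(0);
                copyP.clear(v);
                Q.clear(v);
                Q.and(invN[v]);                  // keep only non-neighbours of v
                U[i] = v;
                colour[i++] = colourClass;
            }
        }
    }

    public static void main(String[] args) {
        int n = 5;                               // hypothetical example: the 5-cycle 0-1-2-3-4-0
        int[][] A = new int[n][n];
        for (int v = 0; v < n; v++) { A[v][(v + 1) % n] = 1; A[(v + 1) % n][v] = 1; }
        BitSet P = new BitSet(n);
        P.set(0, n);
        int[] U = new int[n], colour = new int[n];
        colour(P, inverseNeighbourhoods(A), U, colour);
        System.out.println(Arrays.toString(U) + " " + Arrays.toString(colour));
    }
}
```

On the 5-cycle this yields U = [0, 2, 1, 3, 4] with colours [1, 1, 2, 2, 3]: three colour classes, as expected for an odd cycle.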

MCSa colours vertices in a static order. This is achieved in BBMC by a renaming of the 
vertices, and this is done by re-ordering the adjacency matrix at the top of search. 

BBMC in Java: We implement sets using Java's BitSet class, therefore from now on we refer to P 
as the candidate BitSet and an ordered array of integers U as the ordered candidate set. In 
Listing 1.9, lines 5 to 7, we have an array of BitSet N for representing neighbourhoods, invN as 
the inverse neighbourhoods (the complement of N) and V an array of Vertex. N[i] is then a BitSet 
representing the neighbourhood of the i-th vertex in the array V, and invN[i] its complement. The 
array V is used at the top of search for renaming vertices (and we discuss this later). 

The search method (lines 16 to 30) creates the candidate BitSet P, current clique (as a BitSet) 
C, and Vertex array V. The orderVertices method renames the vertices and will be discussed later. 
The method BBMaxClique corresponds to the procedure in Figure 3 of [22] and can be compared 
to the expand method in Listing 1.7. In a BitSet we use cardinality rather than size (lines 35, 
40 and 44). The integer array U (same name as in [22]) is essentially the colour ordered candidate 
set such that if v = U[i] then colour[i] corresponds to the colour given to v and 
colour[i] ≤ colour[i+1]. The method call of line 38 colours the vertices and delivers those 
colours in the array colour and the sorted candidate set in U. The for loop, lines 39 to 47 
(again, counting down from m - 1 to zero), first tests to see if the colour cut off occurs 
(line 40) and if it does the method returns. Otherwise a new candidate BitSet is created, newP 
on line 41, as a clone of P. The current vertex v is then selected (line 42) and in line 43 v is 
added to the growing clique C and newP becomes the BitSet corresponding to the vertices in the 
candidate BitSet that are in the neighbourhood of v. The operation newP.and(N[v]) (line 43) is 
equivalent to the for loop in lines 34 to 37 of Listing 1.3 of MCQ. If the current clique is both 
maximal and a maximum it is saved via BBMC's specialised save method (described later), 
otherwise if C is not maximal (i.e. newP is not empty) a recursive call is made to BBMaxClique. 
Regardless, v is removed from the current candidate BitSet and the current clique (line 46) and 
the for loop continues. 

Method BBColour corresponds to the procedure of the same name in Figure 2 of [22] but differs 
in that it does not explicitly represent colour classes and therefore does not require a pigeonhole 
sort as in San Segundo's description. Our method takes the candidate BitSet P (see line 38), 
ordered candidate set U and array of colour as parameters. Due to the nature of Java's BitSet the 
and operation is not functional but actually modifies bits, consequently cloning is required (line 
51) and we take a copy of P. In line 52 colourClass records the current colour class, initially 
zero, and i is used as a counter for adding coloured vertices into the array U. The while loop, 
lines 54 to 64, builds up colour classes whilst consuming vertices in copyP. The BitSet Q (line 56) 
is the candidate BitSet as we are about to start a new colour class. The while loop of lines 57 to 
64 builds a colour class: the first vertex in Q is selected (line 58) and is removed from the 
candidate BitSet copyP (line 59) and BitSet Q (line 60). Q then becomes the set of vertices that 
are in the current candidate BitSet (Q) and in the inverse neighbourhood of v (line 61), i.e. Q 
becomes the BitSet of vertices that can join the same colour class with v. We then add v to the 
ordered candidate set U (line 62), record its colour and increment our counter (line 63). When Q 
is exhausted (line 57) the outer while loop (line 54) starts a new colour class (lines 55 to 64). 

Listing 1.10 shows how the candidate BitSet is renamed/reordered. In fact it is not the 
candidate BitSet that is reordered, rather it is the description of the neighbourhood N and its 
inverse invN that is reordered. Again, as in MCQ and MCSa a Vertex array is created (lines 69 to 
73) and is sorted into one of three possible orders (lines 74 to 76). Once sorted, a bit in 
position i of the candidate BitSet P corresponds to the integer vertex v = V[i].index. The 
neighbourhood and its inverse are then reordered in the loop of lines 77 to 83. For all pairs 
(i,j) we select the corresponding vertices u and v from V (lines 79 and 80) and if they are 
adjacent then the j-th bit of N[i] is set true, otherwise false (line 81). Similarly, the inverse 
neighbourhood is updated in line 82. The loop could be made twice as fast by exploiting 
symmetries in the adjacency matrix A. In any event, this method is called once, at the top of 
search, and is generally an insignificant contribution to run time. 
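The remark about exploiting symmetry can be sketched as follows. This is a hedged illustration rather than the paper's code: the class and method names are hypothetical, order[i] stands in for V[i].index, and the adjacency matrix is assumed to be symmetric with a zero diagonal (a simple undirected graph). The method full reproduces the double loop of Listing 1.10 for comparison; half visits each unordered pair once.

```java
import java.util.Arrays;
import java.util.BitSet;

class SymmetricRename {
    static BitSet[] empty(int n) {
        BitSet[] s = new BitSet[n];
        for (int i = 0; i < n; i++) s[i] = new BitSet(n);
        return s;
    }

    // All n*n pairs, as in the loop described above. Returns {N, invN}.
    static BitSet[][] full(int[][] A, int[] order) {
        int n = order.length;
        BitSet[] N = empty(n), invN = empty(n);
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                int u = order[i], v = order[j];
                N[i].set(j, A[u][v] == 1);
                invN[i].set(j, A[u][v] == 0);
            }
        return new BitSet[][]{N, invN};
    }

    // Each unordered pair once; assumes A is symmetric with a zero diagonal.
    static BitSet[][] half(int[][] A, int[] order) {
        int n = order.length;
        BitSet[] N = empty(n), invN = empty(n);
        for (int i = 0; i < n; i++) {
            invN[i].set(i);                      // A[u][u] == 0 in a simple graph
            for (int j = i + 1; j < n; j++) {
                boolean adjacent = A[order[i]][order[j]] == 1;
                N[i].set(j, adjacent);     N[j].set(i, adjacent);
                invN[i].set(j, !adjacent); invN[j].set(i, !adjacent);
            }
        }
        return new BitSet[][]{N, invN};
    }

    public static void main(String[] args) {
        int[][] A = {{0, 1, 0, 1}, {1, 0, 1, 0}, {0, 1, 0, 0}, {1, 0, 0, 0}};
        int[] order = {1, 0, 3, 2};              // hypothetical renaming
        System.out.println(Arrays.deepEquals(full(A, order), half(A, order)));
    }
}
```

Both methods produce the same bit sets on such input; the second simply halves the number of adjacency matrix look-ups.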

BBMC requires its own saveSolution method (lines 86 to 90 of Listing 1.10) due to C being 
a BitSet. Again the solution is saved into the integer array solution and again we need to use the 
Vertex array V to map bits to vertices. This is done in line 88: if the i-th bit of C is true then 
integer vertex V[i].index is in the solution. This explains why V is global to the BBMC class. 



Observations on BBMC: In our Java implementation we might expect a speed up if we did 
away with the in-built BitSet and did our own bit-string manipulations explicitly. It is also worth 
noting that in [21] comparisons are made with Tomita's results in [27] by re-scaling tabulated 
results, i.e. Tomita's code was not actually run. This is not unusual. 

In [21] there is the bit-board version, BB_ReCol in Fig. 1, of Tomita's Re-NUMBER. I believe 
that the version reported is flawed and can result in colour classes not being pair-wise disjoint, 
and that Fig. 1 should have a return statement between lines 7 and 8, similar to the return 
statement in Re-NUMBER. As it stands BB_ReCol can result in the candidate set becoming a 
multi-set, resulting in redundant re-exploration of the search space with subsequent poor 
performance. 



2.5 Summary of MCQ, MCR, MCS and BBMC 

Putting aside the chronology [26,25,27,22], MCSa is the most general algorithm presented here. 
BBMC is in essence MCSa with a BitSet encoding of sets. MCQ is MCSa except that we do away 
import java.util.*;

public class BBMC extends MCQ {

    BitSet[] N;    // neighbourhood
    BitSet[] invN; // inverse neighbourhood
    Vertex[] V;    // mapping bits to vertices

    BBMC (int n,int[][] A,int[] degree,int style) {
        super(n,A,degree,style);
        N    = new BitSet[n];
        invN = new BitSet[n];
        V    = new Vertex[n];
    }

    void search(){
        cpuTime  = System.currentTimeMillis();
        nodes    = 0;
        BitSet C = new BitSet(n);
        BitSet P = new BitSet(n);
        for (int i=0;i<n;i++){
            N[i]    = new BitSet(n);
            invN[i] = new BitSet(n);
            V[i]    = new Vertex(i,degree[i]);
        }
        orderVertices();
        C.set(0,n,false);
        P.set(0,n,true);
        BBMaxClique(C,P);
    }

    void BBMaxClique(BitSet C,BitSet P){
        if (timeLimit > 0 && System.currentTimeMillis() - cpuTime >= timeLimit) return;
        nodes++;
        int m = P.cardinality();
        int[] U = new int[m];
        int[] colour = new int[m];
        BBColour(P,U,colour);
        for (int i=m-1;i>=0;i--){
            if (colour[i] + C.cardinality() <= maxSize) return;
            BitSet newP = (BitSet)P.clone();
            int v = U[i];
            C.set(v,true); newP.and(N[v]);
            if (newP.isEmpty() && C.cardinality() > maxSize) saveSolution(C);
            if (!newP.isEmpty()) BBMaxClique(C,newP);
            P.set(v,false); C.set(v,false);
        }
    }

    void BBColour(BitSet P,int[] U,int[] colour){
        BitSet copyP    = (BitSet)P.clone();
        int colourClass = 0;
        int i           = 0;
        while (copyP.cardinality() != 0){
            colourClass++;
            BitSet Q = (BitSet)copyP.clone();
            while (Q.cardinality() != 0){
                int v = Q.nextSetBit(0);
                copyP.set(v,false);
                Q.set(v,false);
                Q.and(invN[v]);
                U[i] = v;
                colour[i++] = colourClass;
            }
        }
    }

Listing 1.9. San Segundo's BB-MaxClique in Java (part 1) 
    void orderVertices(){
        for (int i=0;i<n;i++){
            V[i] = new Vertex(i,degree[i]);
            for (int j=0;j<n;j++)
                if (A[i][j] == 1) V[i].nebDeg = V[i].nebDeg + degree[j];
        }
        if (style == 1) Arrays.sort(V);
        if (style == 2) minWidthOrder(V);
        if (style == 3) Arrays.sort(V,new MCRComparator());
        for (int i=0;i<n;i++)
            for (int j=0;j<n;j++){
                int u = V[i].index;
                int v = V[j].index;
                N[i].set(j,A[u][v] == 1);
                invN[i].set(j,A[u][v] == 0);
            }
    }

    void saveSolution(BitSet C){
        Arrays.fill(solution,0);
        for (int i=0;i<C.size();i++) if (C.get(i)) solution[V[i].index] = 1;
        maxSize = C.cardinality();
    }
}

Listing 1.10. San Segundo's BB-MaxClique in Java (part 2) 



with the static colour ordering and allow MCQ to colour and sort the candidate set using the 
candidate set, somewhat in the manner of Uroborus the serpent that eats itself. And MCSb is 
MCSa with an additional colour repair step. Therefore we might have an alternative hierarchy of 
the algorithms, such as the hierarchy of Figure 6 within the image of Uroborus. 




Fig. 6. An alternative hierarchy of the algorithms. 



3 Exact Algorithms for Maximum Clique: a brief history 

We now present a brief history of complete algorithms for the maximum clique problem, starting 
from 1990. The algorithms are presented in chronological order. 

1990: In 1990 [6] Carraghan and Pardalos present a branch and bound algorithm. Vertices are 
ordered in non-decreasing degree order at each depth in the binomial search with a cut-off based 
on the size of the largest clique found so far. Their algorithm is presented in Fortran 77 along 
with code to generate random graphs, consequently their empirical results are entirely 
reproducible. Their algorithm is similar to MC (Listing 1.1) but sorts the candidate set P using 
current degree in each call to expand. 

1992: In [18] Pardalos and Rodgers present a zero-one encoding of the problem where a vertex v 
is represented by a variable x_v that takes the value 1 if search decides that v is in the clique 
and 0 if it is rejected. Pruning takes place via the constraint 
¬adjacent(u,v) → x_u + x_v ≤ 1 (Rule 4). In addition, a candidate vertex adjacent to all vertices 
in the current clique is forced into the clique (Rule 5) and a vertex of degree too low to 
contribute to the growing clique is rejected (Rule 7). The branch and bound search selects 
variables dynamically based on current degree in the candidate set: a non-greedy selection 
chooses a vertex of lowest degree and greedy selects highest degree. The computational results 
showed that greedy was good for (easy) sparse graphs and non-greedy good for (hard) dense graphs. 

1994: In [19] Pardalos and Xue reviewed algorithms for the enumeration problem (counting 
maximal cliques) and exact algorithms for the maximum clique problem. Although dated, it 
continues to be an excellent review. 



1997: In [31] graph colouring and fractional colouring are used to bound search. Comparing again 
to MC (Listing 1.1) the candidate set is coloured greedily and if the size of the current clique 
plus the number of colours used is less than or equal to the size of the largest clique found so 
far that branch of search is cut off. In [31] vertices are selected in non-increasing degree 
order, the opposite of that proposed by [18]. We can get a similar effect to [31] in MC if we 
allow free selection of vertices, colour newP between lines 42 and 43 and make the recursive call 
to expand in line 43 conditional on the colour bound. 



2002: Patric R. J. Östergård proposed an algorithm that has a dynamic programming flavour 
[17]. The search process starts by finding the largest clique containing vertices drawn from the 
set S_n = {v_n} and records its size in c[n]. Search then proceeds to find the largest clique in 
the set S_i = {v_i, v_{i+1}, ..., v_n} using the value in c[i + 1] as a bound. The vertices are 
ordered at the top of search in colour order, i.e. the vertices are coloured greedily and then 
ordered in non-decreasing colour order, similar to that in numberSort (Listing 1.3). Östergård's 
algorithm is available as Cliquer. In the same year Torsten Fahle [8] presented a simple 
algorithm (Algorithm 1) that is essentially MC but with a free selection of vertices rather than 
the fixed iteration in line 36 of Listing 1.1 and dynamic maintenance of vertex degree in the 
candidate set. This is then enhanced (Algorithm 2) with forced accept and forced reject steps 
similar to Rules 4, 5 and 7 of [18] and the algorithm is named DF (Domain Filtering). DF is then 
enhanced to incorporate a colouring bound, similar to that in Wood [31]. 



2003: Jean-Charles Régin proposed a constraint programming model for the maximum clique 
problem [20]. His model uses a matching in a duplicated graph to deliver a bound within search, 
a Not Set as used in the Bron Kerbosch enumeration Algorithm 457 [4] and vertex selection using 
the pivoting strategy similar to that in [4,2,28,7]. That same year Tomita reported MCQ [26]. 

2007: Tomita proposed MCR [25] and in the same year Janez Konc and Dušanka Janežič 
proposed the MaxCliqueDyn algorithm [12]. The algorithm is essentially MCQ [26] with dynamic 
reordering of vertices in the candidate set, using current degree, prior to colouring. This 
reordering is expensive, takes place high up in the backtrack tree and is controlled by a 
parameter T_limit. Varying this parameter influences the cost of the search process and T_limit 
must be tuned on an instance-by-instance basis. 



Cliquer is available from http://users.tkk.fi/pat/cliquer.html/ 
MaxCliqueDyn is available from http://www.sicmm.org/~konc/ 
2010: Pablo San Segundo and Cristóbal Tapia presented an early version of BBMC (BB-MCP) 
[23] and Tomita presented MCS [27]. In the same year Li and Quan proposed new max-SAT 
encodings for maximum clique [14,13]. 

2011: Pablo San Segundo proposed BBMC [22] and in [21] a version of BBMC with the colour 
repair steps from Tomita's MCS. 

2012: Renato Carmo and Alexandre P. Züge [5] reported an empirical study of 8 algorithms 
including those from [6] and [8] along with MCQ, MCR (equivalent to MCQ3), MCS (equivalent 
to MCSb1) and MaxCliqueDyn. The claim is made that the Bron Kerbosch algorithm provides a 
unified framework for all the algorithms studied, although a Not Set is not used, neither do they 
use pivoting as described in [4,2,28,7]. Their algorithms can be viewed as an iterative 
(non-recursive) version of MC. Referring to algorithm MaxClique(G) in [5], the stack S holds 
pairs (Q,K) where Q is the currently growing clique and K is the candidate set. In line 4 a pair 
(Q,K) is popped from S. The while loop, lines 5 to 8, selects and removes a vertex v from K 
(line 6), pushes the pair (Q,K) onto S (line 7), and then updates Q and K (line 8) such that Q 
has vertex v added to it and K becomes all vertices in K adjacent to v, i.e. K becomes the 
updated candidate set for updated Q. Therefore lines 6 and 7 give a deferred iteration over the 
candidate set with v removed from K and v not added to Q. Therefore a pop in line 4 is equivalent 
to going once round the for loop in method expand in MC. All algorithms are coded in Python. The 
study is objective (the authors include none of their own algorithms) and fair (all algorithms 
are coded by the authors and run in the same environment) and in that regard is both exceptional 
and laudable. 
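The stack-based scheme described for 2012 can be sketched in Java as below. This is only a reading of the description above, not the code of [5]: the class and method names are hypothetical, no bound of any kind is applied, and vertices are taken from the end of K purely for convenience.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class IterativeMC {
    // A stack entry: the growing clique Q and its candidate set K.
    static final class Pair {
        final List<Integer> Q, K;
        Pair(List<Integer> Q, List<Integer> K) { this.Q = Q; this.K = K; }
    }

    static List<Integer> maxClique(int[][] A) {
        List<Integer> all = new ArrayList<Integer>();
        for (int v = 0; v < A.length; v++) all.add(v);
        Deque<Pair> S = new ArrayDeque<Pair>();
        S.push(new Pair(new ArrayList<Integer>(), all));
        List<Integer> best = new ArrayList<Integer>();
        while (!S.isEmpty()) {
            Pair top = S.pop();
            List<Integer> Q = top.Q, K = top.K;
            while (!K.isEmpty()) {
                int v = K.remove(K.size() - 1);            // select and remove v from K
                S.push(new Pair(new ArrayList<Integer>(Q), // deferred branch: v rejected
                                new ArrayList<Integer>(K)));
                List<Integer> newK = new ArrayList<Integer>();
                for (int u : K) if (A[u][v] == 1) newK.add(u);
                Q.add(v);                                  // v accepted into the clique
                K = newK;                                  // candidates adjacent to v
            }
            // K is empty: Q cannot be extended further on this branch.
            if (Q.size() > best.size()) best = Q;
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical example: triangle 0-1-2 with a tail 2-3-4.
        int[][] A = new int[5][5];
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}, {2, 3}, {3, 4}};
        for (int[] e : edges) { A[e[0]][e[1]] = 1; A[e[1]][e[0]] = 1; }
        System.out.println(maxClique(A));
    }
}
```

Each pushed pair is the "reject v" alternative, so popping it later corresponds to the next iteration of the for loop in the recursive expand.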

4 The Computational Study 

The computational study attempts to answer the following questions. 

1. Where does the improvement in MCS come from? By comparing MCQ with MCSa we can 
measure the contribution due to static colouring and by comparing MCSa with MCSb we can 
measure the contribution due to colour repair. 

2. How much benefit can be had from the BitSet encoding? We compare MCSa with BBMC over 
a variety of problems. 

3. We have three possible initial orderings (styles). Is any one of them better than the others and 
is this algorithm independent? 

4. The candidate set is ordered in non-decreasing colour order. Could a tie-breaking rule influence 
performance? 

5. Most papers use only random problems and the DIMACS benchmarks. What other problems 
might we use in our investigation? 

6. Is it safe to recalibrate published results? 

Throughout our study we use a reference machine (named Cyprus), a machine with two Intel
E5620 2.4GHz quad-core processors and 48 GB of memory, running Linux CentOS 5.3 and Java
version 1.6.0_07.

4.1 MCQ versus MCS: static ordering and colour repair 

Is MCS faster than MCQ, and if so why? Just to recap, MCSa is MCQ with a static colour
ordering set at the top of search and MCSb is MCSa with the colour repair mechanism. By
comparing these algorithms we can determine if indeed MCSb is faster than MCQ and where that
gain comes from: the static colouring order or the colour repair. We start our investigation with
Erdős-Rényi random graphs G(n,p) where n is the number of vertices and each edge is included
in the graph with probability p, independently of every other edge. The code for generating these
graphs is given in Appendix 2.
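For readers without Appendix 2 to hand, the G(n,p) model can be sketched as below. This is an illustrative sketch only and is not the generator from Appendix 2: the class name, method names and the seed parameter are assumptions made for the example.

```java
import java.util.Random;

// Hypothetical sketch of an Erdős-Rényi G(n,p) generator; not the Appendix 2 code.
class RandomGraph {

    // Returns a symmetric adjacency matrix; each of the n(n-1)/2 possible
    // edges is included independently with probability p.
    static boolean[][] gnp(int n, double p, long seed) {
        Random rnd = new Random(seed);
        boolean[][] adj = new boolean[n][n];
        for (int u = 0; u < n; u++) {
            for (int v = u + 1; v < n; v++) {
                if (rnd.nextDouble() < p) {   // nextDouble() is uniform on [0,1)
                    adj[u][v] = true;
                    adj[v][u] = true;
                }
            }
        }
        return adj;
    }

    // Counts undirected edges by scanning the upper triangle.
    static int countEdges(boolean[][] adj) {
        int edges = 0;
        for (int u = 0; u < adj.length; u++)
            for (int v = u + 1; v < adj.length; v++)
                if (adj[u][v]) edges++;
        return edges;
    }
}
```

Because edges are drawn independently, the number of edges varies from instance to instance for a fixed p, which matters when averages are taken over a small sample.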

The first experiments are on random G(n,p), first with n = 100, 0.40 ≤ p ≤ 0.99, p varying
in steps of 0.01, sample size of 100, then with n = 150, 0.50 ≤ p ≤ 0.95, p varying in steps of
0.10, sample size of 100, and n = 200, 0.55 ≤ p ≤ 0.95, p varying in steps of 0.10, sample size
of 100. Unless otherwise stated, all experiments are carried out on our reference machine. The
algorithms MCQ, MCSa and MCSb all use style = 1 (i.e. MCQ1, MCSa1, MCSb1). Figure 7
shows on the left average number of nodes against edge probability and on the right average run
time in milliseconds against edge probability for MCQ1, MCSa1 and MCSb1. The top row has
n = 100, middle row n = 150 and bottom row n = 200 (see footnote 7). As we apply the modifications to MCQ
we see a reduction in nodes, with MCSb1 exploring fewer states than MCSa1 and MCSa1 fewer than
MCQ1. But on the right we see that a reduction in search space does not always result in a reduction
in run time. MCSb1 is always slower than MCSa1, i.e. the colour repair is too expensive, and when
n = 100 MCSb1 is often more expensive to run than MCQ! Therefore it appears that MCS gets
its advantage just from the static colour ordering and that the colour repair slows it down.

We also see a region where problems are hard for all our algorithms, at n = 100 and n = 150,
both in terms of nodes and run time. However at n = 200 there is a different picture. We see a
hard region in terms of nodes but an ever increasing run time. That is, even though nodes are
falling CPU time is climbing. This agrees with the tabulated results in [22] (Tables 4 and 5 on
page 580) for BB-MaxClique. We conjecture that run time increases because each
node (call to expand) incurs more cost in colouring the relatively larger candidate set. In
going from G(200, 0.90) to G(200, 0.95) maximum clique size increased on average from 41 to 62,
a 50% increase, and for MCSa1 the average number of nodes fell by 20% (30% for MCSb1): the search
space has fallen but clique size has increased, which increases the cost of colouring and results in
an overall increase in run time.

Figure 7 shows erratic behaviour on G(100,p) with 0.85 < p < 0.90. Why is this? Figure
8 shows on the left a plot of the number of edges in each of the 100 instances generated for
G(100,p) with p ∈ {0.86, 0.87, 0.88}, one contour for each. G(100, 0.87) instances are very much a
mix between G(100, 0.86) and G(100, 0.88) instances. On the right is a scatter plot of search effort
against edge probability for MCSa1 on G(100,p), sample size 100. We see a large variation and it
is this that gives us the erratic behaviour with 0.85 < p < 0.90. Most likely, a larger sample size
would smooth out the average.

We now report on the 66 DIMACS instances [1] in Table 1. For each algorithm we have 3
entries: the number of nodes, CPU time in seconds, and in brackets the size of the largest clique
found. Each algorithm was allowed 14,400 CPU seconds, i.e. 4 hours, and if that was exceeded we
have a table entry of " — ". The best CPU time in a row is in bold font, and when the CPU time limit
is exceeded the largest clique size found is emboldened. A time entry of 0 corresponds to
a run time of less than a second and we then consider the problem as being too easy to be of
interest. Overall, we see that MCQ1 is rarely the best choice, with MCSa1 or MCSb1 performing
better. There are 11 problems where MCSb1 beats MCSa1 and 8 problems where MCSa1 beats
MCSb1. Therefore the DIMACS benchmarks don't significantly separate the behaviour of these
two algorithms.

4.2 BBMC versus MCSa: a change of representation 

What advantage is to be had from the change of representation between MCSa and BBMC, i.e. 
representing sets as ArrayList in MCSa and as a BitSet in BBMC? MCSa and BBMC are at 
heart the same algorithm. They both produce the same colourings, order the candidate set in the 
same way and explore the same backtrack tree. The only difference is in the implementation of 
sets and how these are exploited. 
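To make the change of representation concrete, the sketch below contrasts the two ways of computing the new candidate set after choosing a vertex v. It is an illustration under assumptions, not the code of MCSa or BBMC: the class and method names are hypothetical and the graph is assumed to be held as an adjacency matrix.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch contrasting list-based and bit-set-based candidate updates.
class CandidateUpdate {

    // List style: test each remaining candidate w for adjacency to v.
    static List<Integer> filterList(List<Integer> p, int v, boolean[][] adj) {
        List<Integer> newP = new ArrayList<>();
        for (int w : p) if (adj[v][w]) newP.add(w);
        return newP;
    }

    // Bit-set style: the new candidate set is P AND N(v), computed a word at a time.
    static BitSet filterBits(BitSet p, BitSet neighboursOfV) {
        BitSet newP = (BitSet) p.clone();   // leave the caller's set untouched
        newP.and(neighboursOfV);
        return newP;
    }

    // Builds the neighbourhood of v as a BitSet (done once per vertex in practice).
    static BitSet neighbourhood(boolean[][] adj, int v) {
        BitSet n = new BitSet(adj.length);
        for (int w = 0; w < adj.length; w++) if (adj[v][w]) n.set(w);
        return n;
    }
}
```

Both methods return the same set of vertices; only the cost of producing it differs, which is the effect the experiments in this section set out to measure.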

Figure 9 shows on the left run time of MCSa1 (x-axis) against run time of BBMC1 (y-axis)
in milliseconds on each of the G(100,p) random instances. On the right we have the same but
for G(200,p). The dotted line is the reference x = y. If points are below the line then BBMC1 is
faster than MCSa1. BBMC1 is typically twice as fast as MCSa1.

7 For MCQ1 the sample size at G(200, 0.95) was reduced to 28, i.e. the MCQ1-200 job was terminated 
after 60 hours. 
Fig. 7. G(n,p), sample size 100. MCQ versus MCS, where's the win? On the left search effort in
nodes visited (i.e. decisions made by the search process) and on the right run time in milliseconds.

Fig. 8. How many edges are there in G(100,p) with p ∈ {0.86, 0.87, 0.88} and how does search
cost vary?




Fig. 9. Run time of MCSa1 against BBMC1, on the left G(100,p) and on the right G(200,p).

Table 1. DIMACS instances: MCQ versus MCS, nodes, run time and (clique size)

In Table 2 we tabulate Goldilocks instances from the DIMACS benchmark suite: we remove
the instances that are too easy (take less than a second) and those that are too hard (take more
than 4 hours) leaving those that are "just right" for both algorithms. Under each algorithm we
have: nodes visited (and this is the same for both algorithms), run time in seconds and in brackets
size of the maximum clique. The column on the far right is the ratio of MCSa1's run time over
BBMC1's run time, and a value greater than 1 shows that BBMC1 was faster by that amount.
Again we see similar behaviour to that observed over the random problems: BBMC1 is typically
twice as fast as MCSa1.

Table 2. DIMACS Goldilocks instances: MCSa1 versus BBMC1



4.3 MCQ and MCS: the effect of style 

What effect does the initial ordering of vertices have on performance? First, we investigate MCQ,
MCSa and MCSb with our three orderings: style 1 being non-increasing degree, style 2 a minimum
width ordering, style 3 non-increasing degree tie-breaking on the accumulated degree of
neighbours. At this stage we do not consider BBMC as it is just a BitSet encoding of MCSa.
We use random problems G(n,p) with n equal to 100 and 150 with sample size of 100. This is
shown graphically in Figure 10, on the left G(100,p) and on the right G(150,p), with average nodes
visited plotted against edge probability. Plots on the first row are for MCQ, middle row MCSa and
bottom MCSb. For MCQ style 3 is the winner and style 2 is worst, whereas in MCSa and MCSb
style 2 is always best. Why is this? In MCQ the candidate set is ordered as the result of colouring
and this order is then used in the next colouring. Therefore MCQ gradually disrupts the initial
minimum width ordering but MCSa and MCSb do not (and neither does BBMC). The minimum
width ordering (style 2) is best for MCSa, MCSb and BBMC. Note that MCQ3 is Tomita's MCR
[25] and our experiments on G(n,p) show that MCR (MCQ3) beats MCQ (MCQ1).
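Two of the initial orderings can be sketched as follows. This is a hedged illustration and not the ordering code used in the study: the class and method names are hypothetical, ties are broken on vertex index purely to make the example deterministic, and the minimum width ordering is assumed here to be built by repeatedly removing a vertex of minimum remaining degree and placing it in front of the vertices removed after it is placed, so that the last vertex removed comes first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of style 1 (degree) and style 2 (minimum width) orderings.
class InitialOrdering {

    static int[] degrees(boolean[][] adj) {
        int[] deg = new int[adj.length];
        for (int u = 0; u < adj.length; u++)
            for (int v = 0; v < adj.length; v++)
                if (adj[u][v]) deg[u]++;
        return deg;
    }

    // Style 1: vertices in non-increasing degree order, ties broken on index.
    static List<Integer> byDegree(boolean[][] adj) {
        int[] deg = degrees(adj);
        List<Integer> order = new ArrayList<>();
        for (int v = 0; v < adj.length; v++) order.add(v);
        order.sort((a, b) -> deg[b] != deg[a]
                ? Integer.compare(deg[b], deg[a])
                : Integer.compare(a, b));
        return order;
    }

    // Style 2 (assumed construction): repeatedly remove a minimum-degree vertex
    // from the remaining graph; later removals are placed in front.
    static List<Integer> minWidth(boolean[][] adj) {
        int n = adj.length;
        int[] deg = degrees(adj);
        boolean[] removed = new boolean[n];
        List<Integer> order = new ArrayList<>();
        for (int step = 0; step < n; step++) {
            int best = -1;
            for (int v = 0; v < n; v++)
                if (!removed[v] && (best < 0 || deg[v] < deg[best])) best = v;
            removed[best] = true;
            for (int w = 0; w < n; w++)
                if (adj[best][w] && !removed[w]) deg[w]--;   // update remaining degrees
            order.add(0, best);
        }
        return order;
    }
}
```

The degree ordering looks only at the original graph, whereas the minimum width ordering recomputes degrees as vertices are removed, so the two can differ even on small graphs.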

We now report on the 66 DIMACS instances [1] in Tables 3 and 4. Table 3 gives run times in
seconds. An entry of " — " corresponds to the CPU time limit of 14,400 seconds being exceeded
and search terminating early, and an entry of 0 (zero) to a run time of less than a second. For
each algorithm we have three columns, one for each style: the first column S1 is style 1 with vertices
in non-increasing degree order, S2 is style 2 with vertices in minimum width order, S3 is style 3
with vertices in non-increasing degree order tie-breaking on sum of neighbouring degrees. Table 4
is the number of nodes, in thousands, for the experiments in Table 3. In Table 3 a bold entry is
the best run time for that algorithm against the problem instance, and this is done only when run
times differ significantly. For MCQ there is no particular style that is a consistent winner. This is
a surprise as MCQ3 is Tomita's MCR and in [25] it is claimed that MCR was faster than MCQ.
The evidence that supports this claim is Table 2 of [33], 8 of the 66 DIMACS instances. For MCSa
and BBMC style 2 is best more often than not, and in MCSb style 1 is best more often than not.
Overall we see that BBMC2 is our best algorithm, i.e. BBMC with a minimum width ordering.

[Figure 10: plots of mean nodes against edge probability, one curve per style (e.g. MCQ1, MCQ2,
MCQ3). Recovered panel titles: "Mean Nodes: MCQ on G(100,p)", "Mean Nodes: MCQ on G(150,p)",
"Mean Nodes: MCSb on G(100,p)", "Mean Nodes: MCSb on G(150,p)".]

Fig. 10. The effect of style on MCQ, MCSa and MCSb. On the left G(100,p) and on the right
G(150,p). Plotted is search effort in nodes against edge probability. The top two plots are for
MCQ, middle plots MCSa and bottom MCSb.

Table 3. DIMACS instances: the effect of style on run time in seconds



4.4 MCSa: tie breaking in colour classes 

It has often been reported that the order in which vertices are chosen for expansion can have a profound
effect on search effort. This is demonstrated in Figure 11 where the algorithm MC0 was applied to
G(70,p) with a sample size of 100 using all three styles (MC1, MC2, MC3), using index order (MCi)
and the reverse orderings (MC-1, MC-2, MC-3). Plotted is edge probability against logarithm of
average run time. MC (and MC0) expand vertices in the candidate set from last to first, therefore
with a style of 1 vertices are expanded in non-decreasing degree order (smallest first). Figure
11 shows that the order can have an enormous effect. Styles -1 and -3 (largest degree first) are
orders of magnitude worse than styles 1 and 3 (smallest degree first). In fact the MC-1 and MC-3
experiments were abandoned after 72 hours with G(70,0.96) incomplete. This suggests that
ordering vertices in colour classes may be worthwhile.



[Figure 11: plot titled "Mean Time: G(70,p)", x-axis edge probability.]

Fig. 11. MC applied to G(70,p) with different initial orderings.



The numberSort method used by MCS delivers a colour-ordered candidate set. Vertices are
picked out of colour classes and added to the candidate set in the order they were added to the
colour class. If vertices are coloured in non-increasing degree order, vertices in a colour class will
also be in non-increasing degree order. Consequently the candidate set will be in non-decreasing
colour order tie-breaking on non-increasing degree. The expand method iterates over the candidate
set from last to first and expands vertices of the same colour class in non-increasing degree order.
What might happen if this was reversed so that we visit vertices from the same colour class in
non-decreasing degree order?
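The colour-ordered candidate set and the reversed tie-break can be illustrated with the sketch below. It is a simplified stand-in for numberSort, written under assumptions: the class and method names are hypothetical, colour classes are kept as plain lists, and the greedy sequential colouring shown is the textbook one rather than a transcription of the listing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of greedy colouring into classes and of flattening the
// classes into a colour-ordered candidate set; not the numberSort listing.
class ColourSort {

    // Each candidate, taken in the given order, joins the first colour class
    // that contains none of its neighbours.
    static List<List<Integer>> colourClasses(List<Integer> candidates, boolean[][] adj) {
        List<List<Integer>> classes = new ArrayList<>();
        for (int v : candidates) {
            int k = 0;
            while (k < classes.size() && conflicts(v, classes.get(k), adj)) k++;
            if (k == classes.size()) classes.add(new ArrayList<>());
            classes.get(k).add(v);
        }
        return classes;
    }

    static boolean conflicts(int v, List<Integer> colourClass, boolean[][] adj) {
        for (int w : colourClass) if (adj[v][w]) return true;
        return false;
    }

    // Flattens the classes in non-decreasing colour order. With
    // reverseWithinClass set, each class is read back to front, which
    // reverses the tie-break among vertices of the same colour.
    static List<Integer> colourOrder(List<List<Integer>> classes, boolean reverseWithinClass) {
        List<Integer> out = new ArrayList<>();
        for (List<Integer> c : classes) {
            if (reverseWithinClass) {
                for (int i = c.size() - 1; i >= 0; i--) out.add(c.get(i));
            } else {
                out.addAll(c);
            }
        }
        return out;
    }
}
```

The two flattenings differ only in the order of vertices that share a colour, so they can differ only when colour classes hold more than one vertex.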

This was done by altering the for loop of line 59 in numberSort (Listing 1.3) so that colour classes
are processed in reverse order. Experiments were then run on G(100,p) and G(150,p) using MCSa
with each of our three orderings. The effect was insignificant. Why might that be? Three individual
problems were analysed from G(100, 0.3), G(100, 0.6) and G(100, 0.9). Within each call to expand
the size of the candidate set and the number of colours used was logged. This is presented in Figure
12. What we see is that for G(100, 0.9) a candidate set of size m typically required m/2 colours,
therefore a colour class typically contained two vertices and there is little scope for tie-breaking
to have an effect. When problems are sparse colour classes get larger (typically 4 to 6 vertices per
colour class in G(100,0.6)) but they are easy and again tie-breaking makes little if any gain.

[Table 4: body not recoverable. The only surviving row fragment (instance name lost) reads
64,412 54,359 64,131 | 46,125 44,880 48,664 | 64,412 54,359 64,131.]

Table 4. DIMACS instances: the effect of style on search nodes in 1,000's

Fig. 12. How many vertices are in a colour class? We plot colours used against size of candidate
set for MCSa1 and an instance of G(100,0.9), G(100,0.6) and G(100,0.3)



4.5 More Benchmarks (not DIMACS) 

In [7] experiments are performed on counting maximal cliques in exceptionally large sparse graphs,
such as the Pajek data sets (graphs with hundreds of thousands of vertices, see footnote 8) and SNAP data sets
(graphs with vertices in the millions, see footnote 9). Those graphs are out of the reach of the exact algorithms
reported here. The initial reason for this is space consumption. To tackle such large sparse problems
we require a change of representation, away from the adjacency matrix and towards the adjacency
lists as used in [7]. Therefore we explore large random instances as in [22,27] to further investigate
ordering and the effect of the BitSet representation, the hard solvable instances in BHOSLIB to
see how far we can go, and structured graphs produced via the SNAP (Stanford Network Analysis
Project) graph generator. But first, we start with BHOSLIB.

In Table 5 we have the only instances from the BHOSLIB suite (Benchmarks with Hidden
Optimum Solutions, see footnote 10) that could be solved in 4 hours. Each instance has a maximum clique of
size 30. A bold entry is the best run time. For this suite we see that with respect to style there
is no clear winner.

Table 6 shows results on large random problems. Similar experiments are reported in Tables 4
and 5 of [22] and Table 2 in [27]. The first three columns are the nodes visited, and this is the
same for MCSa and BBMC. Run times are then given in seconds for MCSa and BBMC using each
of the three styles. Highlighted in bold is the search with fewest nodes and this is style 2 (minimum
width ordering) in all but one case. Comparing the run times we see that as problems get larger,
involving more vertices, the relative speed difference between BBMC and MCSa diminishes and
at n = 15,000 MCSa and BBMC's performances are substantially the same.

8 Available from http://vlado.fmf.uni-lj.si/pub/networks/data/

9 Available from http://snap.stanford.edu/data/index.html

10 Available from http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/graph-benchmarks.htm
instance     n     edges     BBMC1                 BBMC2                 BBMC3
                             nodes      time       nodes      time       nodes      time
frb30-15-1   450   83,198    292,095    3,099      626,833    6,503      361,949    3,951
frb30-15-2   450   83,151    557,252    5,404      599,543    6,136      436,110    4,490
frb30-15-3   450   83,126    167,116    1,707      265,157    2,700      118,495    1,309
frb30-15-4   450   83,194    991,460    9,663      861,391    8,513      1,028,129  9,781
frb30-15-5   450   83,231    282,763    2,845      674,987    7,033      281,152    2,802

Table 5. BHOSLIB using BBMC: 1,000's of nodes and run time in seconds. Problems have 450
vertices and graph density 0.82



instance        nodes                                    MCSa                    BBMC
n        p      S1           S2           S3           S1     S2     S3      S1     S2     S3
1,000    0.1    4,536        4,472        4,563
         0.2    39,478       38,250       38,838
         0.3    400,018      371,360      404,948      4      4      4       2      2      2
         0.4    3,936,761    3,780,737    4,052,677    40     39     38      26     25     26
         0.5    79,603,712   75,555,478   80,018,645   860    910    859     570    574    604
3,000    0.1    144,375      142,719      145,487      3      3      3       2      2      2
         0.2    2,802,011    2,723,443    2,804,830    38     38     38      32     32     32
         0.3    73,086,978   71,653,889   73,354,584   964    960    978     926    930    931
10,000   0.1    5,351,591    5,303,615    5,432,812    236    252    245     212    216    214
15,000   0.1    22,077,212   21,751,100   21,694,036   1,179  1,117  1,081   1,249  1,235  1,208

Table 6. Large random graphs, sample size 10



The graphgen program was downloaded from the SNAP web site and modified to use a random
seed so that generated graphs with the same parameters were actually different. This allows us
to generate a variety of graphs, such as complete graphs, star graphs, 2D grid graphs, Erdős-Rényi
random graphs with an exact number of edges, k-regular graphs (each vertex with degree
k), Albert-Barabási graphs, power law graphs, Kleinberg copying model graphs and small-world
graphs. Finding maximum cliques in a complete graph, star graph and 2D grid graph is trivial.
Similarly, and surprisingly, small scale experiments suggested that Albert-Barabási and Kleinberg's
graphs are also easy with respect to maximum clique. However k-regular and small-world graphs are a
challenge.

The SNAP graphgen program was used to generate k-regular graphs KR(n,k), i.e. random
graphs with n vertices each with degree k. Graphs were generated with n = 200 and 50 ≤ k ≤ 160,
with k varying in steps of 5, 20 instances at each point. BBMC1 and BBMC2 were then applied to
each instance. Obviously, with style equal to 1 or 3, there is no heuristic information to be exploited
at the top of search. But would a minimum width ordering, style 2, have an advantage? Figure
13 shows average search effort in nodes plotted against uniform degree k. We see that minimum
width ordering does indeed have an advantage. What is also of interest is that KR(n,k) instances
tend to be harder than their G(n,p) equivalents. For example, we can compare KR(200, 160) with
G(200,0.8) in Figure 7: in KR(200, 160) each vertex is adjacent to 160 of the other 199 vertices,
a density of about 0.80. MCSa1 took on average 1.9 million nodes for G(200,0.8) and BBMC1
took on average 4.7 million nodes on the twenty KR(200, 160) instances.

Small-world graphs SW(n,k,p) were then generated using graphgen. This takes three parameters:
n the number of vertices, k where each vertex is connected to k nearest neighbours to the right
in a ring topology (i.e. vertices start with uniform degree 2k), and p a rewiring probability. This
corresponds to the graphs in Figure 1 of [29]. Small-world graphs were generated with n = 1,000,
50 ≤ k ≤ 100 in steps of 5, 0.0 ≤ p ≤ 0.25 in steps of 0.01, 10 graphs at each point. BBMC1
was then applied to each instance to investigate how difficulty of finding a maximum clique varies
with respect to k and p and also how size of maximum clique varies, i.e. this is an investigation
of the problem. The results are shown as three dimensional plots in Figure 14: on the left average
search effort and on the right average maximum clique size. Looking at the graph on the left:
when p = 0.0 problems are easy, as p increases and randomness is introduced problems quickly
get hard, but as p continues to increase the graphs tend to become predominantly random and
behave more like large sparse random graphs and get easier. We also see that as neighbourhood size
k increases problems get harder. We can compare the SW(1000, 100, p) to the graphs G(1000, 0.2)
in Table 6. G(1000,0.2) took on average 39,478 nodes whereas SW(1000, 100, 0.01) took 709,347
nodes, SW(1000, 100, 0.08) took 2,702,199 nodes and SW(1000, 100, 0.25) 354,430 nodes. Clearly
small-world instances are relatively hard. Looking at the graph on the right (average maximum
clique size) we see that as rewiring probability p increases maximum clique size decreases and as
k increases so too does maximum clique size.

Fig. 13. k-regular SNAP instances KR(200, k), 130 ≤ k ≤ 160, sample size 20.
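The SW(n,k,p) construction described in the text can be sketched as below. This is an assumption-laden illustration and is not the SNAP graphgen code: the class name, the seed parameter and in particular the rewiring rule (replace the far endpoint with a uniformly chosen vertex, keeping the original edge if the choice is the vertex itself or an existing neighbour) are ours, and the sketch assumes n > 2k.

```java
import java.util.Random;

// Hypothetical sketch of a ring-lattice small-world generator; not SNAP's graphgen.
class SmallWorld {

    // Each vertex u is joined to its k nearest neighbours to the right on a
    // ring, so with p = 0 every vertex has degree 2k. With probability p an
    // edge's far endpoint is replaced by a random vertex (assumed rule).
    static boolean[][] generate(int n, int k, double p, long seed) {
        Random rnd = new Random(seed);
        boolean[][] adj = new boolean[n][n];
        for (int u = 0; u < n; u++) {
            for (int j = 1; j <= k; j++) {
                int v = (u + j) % n;              // j-th neighbour to the right
                if (rnd.nextDouble() < p) {
                    int w = rnd.nextInt(n);       // candidate replacement endpoint
                    if (w != u && !adj[u][w]) v = w;
                }
                adj[u][v] = true;
                adj[v][u] = true;
            }
        }
        return adj;
    }
}
```

With p = 0 the result is the regular ring lattice, in which every run of k + 1 consecutive vertices is a clique, which is consistent with those instances being easy.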




Fig. 14. Small world graphs SW(200,k,p). On the left search effort and on the right maximum 
clique size. 



4.6 Calibration of results 

To compare computational results across publications authors compile and run a standard C
program, dfmax, against a set of benchmarks. These run times are then used as a conversion
factor, and results are then taken from one publication, scaled accordingly, and then included
in another publication. Recent examples of this are [17] including rescaled results from [24]; [20]
including rescaled results from [17], [31] and [5]; [35] including rescaled results from [17] and [2"3];
[22] including rescaled results from [TJ]; [3T] including rescaled results from [35]; [TJ] including
rescaled results from [35] and [5D]. Is this procedure safe?

To test this we take two additional machines, Fais and Daleview, and calibrate them with 
respect to our reference machine Cyprus. We then run experiments on each machine using the Java 
implementations of the algorithms implemented here against some of the DIMACS benchmarks. 
These results are then rescaled. If the rescaling gives substantially different results from those on 
the reference machine this would suggest that this technique is not safe. 

Table 7 gives a "Rosetta Stone" for the three machines used in this study. The standard
program dfmax (see footnote 11) was compiled using gcc and the -O2 compiler option on each machine and
then run on the r* benchmarks on each machine. Run times in seconds are tabulated for the five
benchmark instances, each machine's /proc/cpuinfo is given and a conversion factor relative to
the reference machine Cyprus is then computed in the same manner as that reported in [22]: "...
the first two graphs from the benchmark were removed (user time was considered too small) and
the rest of the times averaged ...". Therefore when rescaling the run times from Fais we multiply
actual run time by 0.41 and for Daleview by 0.50.
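As a worked check of the conversion factors, the sketch below drops the first two benchmarks and takes the ratio of the remaining summed times, which equals the ratio of their averages. This reading of the quoted procedure is an assumption on our part, and the class and method names are hypothetical; applied to the dfmax times in Table 7 it gives values that round to 0.41 for Fais and 0.50 for Daleview.

```java
// Hypothetical sketch of the dfmax conversion-factor calculation.
class DfmaxCalibration {

    // Ratio of reference-machine time to other-machine time, summed over the
    // benchmarks that remain after skipping the first `skip` entries.
    static double scalingFactor(double[] reference, double[] other, int skip) {
        double refSum = 0.0;
        double otherSum = 0.0;
        for (int i = skip; i < reference.length; i++) {
            refSum += reference[i];
            otherSum += other[i];
        }
        return refSum / otherSum;
    }

    public static void main(String[] args) {
        // dfmax run times in seconds for r100.5 ... r500.5, taken from Table 7.
        double[] cyprus   = {0.0, 0.02, 0.24, 1.49, 5.58};
        double[] fais     = {0.0, 0.08, 0.58, 3.56, 13.56};
        double[] daleview = {0.0, 0.09, 0.53, 3.00, 10.95};
        System.out.printf("Fais: %.2f%n", scalingFactor(cyprus, fais, 2));
        System.out.printf("Daleview: %.2f%n", scalingFactor(cyprus, daleview, 2));
    }
}
```

Averaging the three per-benchmark ratios instead gives about 0.49 for Daleview, so the ratio of summed times appears to be the reading that reproduces the tabulated factors.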



machine    r100.5  r200.5  r300.5  r400.5  r500.5  Intel(R)        GHz   cache     Java      scaling factor
Cyprus     0.0     0.02    0.24    1.49    5.58    Xeon(R) E5620   2.40  12,288KB  1.6.0_07  1
Fais       0.0     0.08    0.58    3.56    13.56   XEON(TM) CPU    2.40  512KB     1.5.0_06  0.41
Daleview   0.0     0.09    0.53    3.00    10.95   Atom(TM) N280   1.66  512KB     1.6.0_18  0.50

Table 7. Conversion factors using dfmax on three machines: Cyprus, Fais and Daleview









                 MCSa1                                            BBMC1
instance         Fais            Daleview        Cyprus           Fais            Daleview        Cyprus
brock200-1       0.25 (19,343)   0.27 (17,486)   1.00 (4,777)     0.15 (15,365)   0.09 (25,048)   1.00 (2,358)
brock200-4       0.40 (1,870)    0.43 (1,765)    1.00 (755)       0.20 (1,592)    0.13 (2,464)    1.00 (321)
hamming10-2      0.18 (1,885)    0.14 (2,299)    1.00 (333)       0.25 (608)      0.21 (710)      1.00 (151)
hamming8-4       0.24 (1,885)    0.28 (1,647)    1.00 (455)       0.23 (1,625)    0.19 (1,925)    1.00 (367)
johnson16-2-4    0.35 (2,327)    0.38 (2,173)    1.00 (823)       0.26 (1,896)    0.14 (3,560)    1.00 (495)
MANN-a27         0.21 (32,281)   0.22 (31,874)   1.00 (6,912)     0.14 (12,335)   0.10 (16,491)   1.00 (1,676)
p-hat1000-1      0.25 (8,431)    0.28 (7,413)    1.00 (2,108)     0.14 (8,359)    0.12 (9,389)    1.00 (1,169)
p-hat1500-1      0.19 (77,759)   0.22 (66,113)   1.00 (14,421)    0.11 (90,417)   0.10 (92,210)   1.00 (9,516)
p-hat300-3       0.25 (53,408)   0.26 (51,019)   1.00 (13,486)    0.14 (41,669)   0.09 (60,118)   1.00 (5,711)
p-hat500-2       0.27 (13,400)   0.30 (12,091)   1.00 (3,659)     0.14 (10,177)   0.11 (13,410)   1.00 (1,428)
p-hat700-1       0.40 (1,615)    0.51 (1,251)    1.00 (641)       0.29 (1,169)    0.24 (1,422)    1.00 (344)
san1000          0.11 (94,107)   0.12 (89,330)   1.00 (10,460)    0.10 (57,868)   0.11 (54,816)   1.00 (5,927)
san200-0.9-1     0.29 (4,918)    0.31 (4,705)    1.00 (1,444)     0.18 (4,201)    0.11 (6,588)    1.00 (748)
san200-0.9-2     0.22 (23,510)   0.25 (20,867)   1.00 (5,240)     0.15 (14,572)   0.09 (23,592)   1.00 (2,218)
san400-0.7-1     0.25 (10,230)   0.27 (9,607)    1.00 (2,573)     0.15 (8,314)    0.12 (10,206)   1.00 (1,260)
san400-0.7-2     0.23 (84,247)   0.27 (72,926)   1.00 (19,565)    0.13 (71,360)   0.11 (87,325)   1.00 (9,219)
san400-0.7-3     0.24 (45,552)   0.27 (40,792)   1.00 (10,839)    0.13 (39,840)   0.11 (46,818)   1.00 (5,162)
sanr200-0.7      0.31 (5,043)    0.33 (4,676)    1.00 (1,548)     0.19 (4,079)    0.12 (6,652)    1.00 (795)
sanr400-0.5      0.28 (9,898)    0.31 (8,754)    1.00 (2,745)     0.16 (9,177)    0.12 (12,658)   1.00 (1,484)
ratio (total)    0.21 (491,709)  0.23 (446,788)  1.00 (102,784)   0.13 (394,623)  0.11 (475,402)  1.00 (50,349)

Table 8. Calibration experiments using 3 machines, 2 algorithms and a subset of DIMACS benchmarks



Table 8 shows the results of the calibration experiments. Tabulated are DIMACS benchmark
instances that took more than 1 second and less than 2 minutes to solve using MCSa1 on our second
slowest machine (Fais). Run times are tabulated in milliseconds (in brackets) and the actual ratio
of Cyprus-time over Fais-time (expected to be 0.41) is given as well as Cyprus-time over Daleview-time
(expected to be 0.50) for each data point. Two algorithms are used, MCSa1 and BBMC1.
The last row of Table 8 gives the relative performance ratios computed using the sum of the run
times in the table. Referring back to Table 7 we expect a Cyprus/Fais ratio of 0.41 but empirically
get 0.21 when using MCSa1 and 0.13 when using BBMC1, and expect a Cyprus/Daleview ratio
of 0.50 but empirically get 0.23 with MCSa1 and 0.11 with BBMC1. The conversion
factors in Table 7 consistently over-estimate the speed of Fais and Daleview. For example, we
would expect MCSa1 applied to brock200-1 on Fais to have a run time of 19,343 × 0.41 = 7,930
milliseconds on Cyprus. In fact it takes 4,777 milliseconds. If we use the derived ratio in the last
row of Table 8 we get 19,343 × 0.21 = 4,062 milliseconds, closer to actual performance on Cyprus.
As another example consider san1000 using BBMC1 on Daleview. We would expect this to take
54,816 × 0.50 = 27,408 milliseconds on Cyprus. In fact it takes 5,927 milliseconds! If we use the
conversion ratio from the last row of Table 8 we get a more accurate estimate, 54,816 × 0.11 = 6,030
milliseconds.

11 Available from ftp://dimacs.rutgers.edu/pub/dsj/clique
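The arithmetic in this comparison can be reproduced with a few lines of Java. The sketch is illustrative only and its class and method names are hypothetical: it derives an empirical ratio from the totals in the last row of Table 8, predicts a Cyprus run time from a time measured elsewhere, and reports the relative error of that prediction. For brock200-1 with MCSa1, the dfmax factor of 0.41 over-predicts the Cyprus run time by roughly 66%.

```java
// Hypothetical sketch of the rescaling check; figures in main are from Tables 7 and 8.
class RescaleCheck {

    // Empirical conversion ratio: total reference time over total time elsewhere.
    static double empiricalRatio(long referenceTotalMs, long otherTotalMs) {
        return (double) referenceTotalMs / otherTotalMs;
    }

    // Predicted run time on the reference machine from a time measured elsewhere.
    static double predictedMs(long measuredMs, double factor) {
        return measuredMs * factor;
    }

    // Signed relative error of a prediction against the actual run time.
    static double relativeError(double predictedMs, double actualMs) {
        return (predictedMs - actualMs) / actualMs;
    }

    public static void main(String[] args) {
        double viaDfmax = predictedMs(19343, 0.41);                      // brock200-1, Fais
        double viaTable = predictedMs(19343, empiricalRatio(102784, 491709));
        System.out.printf("dfmax factor: %.0f ms, error %.0f%%%n",
                viaDfmax, 100 * relativeError(viaDfmax, 4777));
        System.out.printf("empirical ratio: %.0f ms, error %.0f%%%n",
                viaTable, 100 * relativeError(viaTable, 4777));
    }
}
```

The same three methods can be applied to any row of Table 8 to see how far the dfmax factor is from the measured ratio for that instance.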







                 Cliquer                                          dfmax
instance         Fais            Daleview         Cyprus          Fais            Daleview        Cyprus
brock200-1       0.66 (9,760)    0.43 (18,710)    1.00 (6,490)    0.39 (25,150)   0.42 (23,020)   1.00 (9,730)
brock200-4       0.64 (690)      0.47 (1,190)     1.00 (440)      0.41 (1,510)    0.46 (1,360)    1.00 (620)
p-hat1000-1      0.62 (1,750)    0.36 (3,020)     1.00 (1,090)    0.41 (1,680)    0.45 (1,540)    1.00 (690)
p-hat700-1       0.67 (150)      0.37 (270)       1.00 (100)      —               —               —
san1000          0.75 (120)      0.30 (300)       1.00 (90)       —               —               —
san200-0.7-1     0.48 (1,750)    0.20 (4,220)     1.00 (840)      —               —               —
san200-0.9-2     0.61 (18,850)   0.21 (53,970)    1.00 (11,530)   —               —               —
san400-0.7-3     0.62 (6,800)    0.26 (16,100)    1.00 (4,230)    —               —               —
sanr200-0.7      0.65 (2,940)    0.36 (5,270)     1.00 (1,900)    0.40 (5,240)    0.44 (4,770)    1.00 (2,080)
sanr400-0.5      0.62 (1,490)    0.38 (2,420)     1.00 (930)      0.41 (3,550)    0.47 (3,080)    1.00 (1,460)
ratio (total)    0.62 (44,300)   0.26 (105,470)   1.00 (27,640)   0.39 (37,130)   0.43 (33,770)   1.00 (14,580)

Table 9. Calibration experiments for Cliquer and dfmax using 3 machines.



But maybe this is because we have used a C program (dfmax) to calibrate a Java program.
Would we get a reliable calibration if a C program was used? Östergård's Cliquer program was
downloaded and compiled on our three machines and run against DIMACS benchmarks, i.e. the
experiments in Table 8 were repeated using Cliquer and dfmax with a different, and easier, set of
problems. The results are shown in Table 9 (see footnote 12). What we see is an actual scaling factor of 0.62 for
Cliquer on Fais when dfmax predicts 0.41 and for Cliquer on Daleview 0.26 when we expect 0.50;
again we see that the rescaling procedure fails. The last three columns show a dfmax calibration
using problems other than the r* benchmarks and here we see an error of about 5% on Fais
(expected 0.41, actual 0.39) and about 16% on Daleview (expected 0.50, actual 0.43). Therefore it
appears that rescaling results using dfmax and the five r* benchmarks is not a safe procedure and
can result in wrong conclusions being drawn regarding the relative performance of algorithms.



4.7 Relative algorithmic performance on different machines 

But is it even safe to draw conclusions on our algorithms when we base those conclusions on
experiments performed on a single machine? Previously, in Table 2, we compared MCSa against
BBMC on our reference machine Cyprus and concluded that BBMC was typically twice as fast as
MCSa. Will that hold on Fais and on Daleview? Table 10 takes the data from Table 8 and divides
the run time of MCSa by that of BBMC for each instance on our three machines. On Fais BBMC is rarely
more than 50% faster than MCSa and on Daleview BBMC is slower than MCSa more often than
not! If experiments were performed only on Daleview using only the DIMACS instances we might
draw entirely different conclusions and claim that BBMC is slower than MCSa. This change in
relative algorithmic ordering has been observed on five different machines (four using the Java
1.6.0) using all of the algorithms.

12 An entry — was a run of dfmax that was terminated after 2 minutes.



instance         Fais    Daleview   Cyprus
brock200-1       1.2?    0.7?       2.0?
brock200-4       1.17    0.72       ?
hamming10-2      ?       ?.24       2.21
hamming8-4       1.10    0.9?       1.24
johnson16-2-4    1.23    0.61       1.66
MANN-a27         2.62    1.93       4.12
p-hat1000-1      1.01    0.79       1.80
p-hat1500-1      0.86    0.72       1.52
p-hat300-3       1.28    0.85       2.36
p-hat500-2       1.32    0.90       2.56
p-hat700-1       1.38    0.88       1.86
san1000          1.63    1.63       1.76
san200-0.9-1     1.17    0.71       1.93
san200-0.9-2     1.61    0.88       2.36
san400-0.7-1     1.23    0.94       2.04
san400-0.7-2     1.18    0.84       2.12
san400-0.7-3     1.14    0.87       2.10
sanr200-0.7      1.24    0.70       1.95
sanr400-0.5      1.08    0.69       1.85


Table 10. Calibration experiment part 2, does hardware affect relative algorithmic performance? 
Values greater than 1 imply BBMC is faster than MCSa, less than 1 MCSa is faster. 
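
To make the reading of Table 10 concrete, the following sketch counts, for five rows copied from the table, how often the ratio falls below 1 on each machine. The class name RatioSummary and the choice of rows are illustrative assumptions; the ratios themselves are taken from the table.

```java
public class RatioSummary {

    // Count the instances on which BBMC is slower than MCSa, i.e. where
    // (run time of MCSa) / (run time of BBMC) is below 1.
    static int countBelowOne(double[] ratios) {
        int count = 0;
        for (double r : ratios) if (r < 1.0) count++;
        return count;
    }

    public static void main(String[] args) {
        // Rows, in order: johnson16-2-4, MANN-a27, p-hat1500-1, san1000, sanr400-0.5.
        double[] fais     = {1.23, 2.62, 0.86, 1.63, 1.08};
        double[] daleview = {0.61, 1.93, 0.72, 1.63, 0.69};
        double[] cyprus   = {1.66, 4.12, 1.52, 1.76, 1.85};

        System.out.println("Fais:     " + countBelowOne(fais)     + " of " + fais.length);
        System.out.println("Daleview: " + countBelowOne(daleview) + " of " + daleview.length);
        System.out.println("Cyprus:   " + countBelowOne(cyprus)   + " of " + cyprus.length);
    }
}
```

For these five rows the count is 1 on Fais, 3 on Daleview and 0 on Cyprus, so the same pair of algorithms would be ranked differently depending on the machine chosen.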



5 Conclusion 

We have seen that small implementation details (in MC) can result in large changes in performance.
Modern programming languages with rich constructs and large libraries of utilities make it easier
for the programmer to introduce such changes. We have also drifted away from the days when
algorithms were presented along with their implementation code (examples here are [1] and [18]) to
presenting algorithms only in pseudo-code. Fortunately we are moving into a new era where code is
being made publicly available (examples here are Östergård's Cliquer and Konc and Janežič's
MaxCliqueDyn). Hopefully this will grow and allow computer scientists to perform reproducible
empirical studies more readily.

Tomita [37] presented MCS as an improvement on MCR brought about via two modifications: 
(1) a static colour ordering and (2) a colour repair step. Our study has shown that modification 
(1) improves performance and (2) degrades performance, i.e. MCSa is better than MCSb. 

BBMC is algorithm MCSa with sets represented as bit strings, i.e. BitSet is used rather than
ArrayList. Experiments on the reference machine showed a speed up, typically by a factor of 2. The
three styles of ordering were investigated. The orderings were quickly disrupted by MCQ; in the
other algorithms minimum width ordering was best on random problems, but on the DIMACS
instances there was no clear winner.
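
As a rough illustration of the difference between the two representations, the sketch below filters a candidate set P down to the neighbours of a vertex v, first with an ArrayList and then with a BitSet. It is a minimal example under assumed names (SetIntersection, filterList and filterBits are hypothetical) and is not the code of MCSa or BBMC.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class SetIntersection {

    // List version: scan P and keep the vertices adjacent to v.
    static List<Integer> filterList(List<Integer> P, int v, int[][] A) {
        List<Integer> newP = new ArrayList<>();
        for (int w : P) if (A[v][w] == 1) newP.add(w);
        return newP;
    }

    // Bit-set version: a single AND of P with the neighbourhood of v.
    static BitSet filterBits(BitSet P, BitSet neighboursOfV) {
        BitSet newP = (BitSet) P.clone();   // copy, so that P itself is unchanged
        newP.and(neighboursOfV);
        return newP;
    }

    public static void main(String[] args) {
        // Small example graph with edges 0-1, 0-2, 1-2 and 2-3.
        int[][] A = {{0, 1, 1, 0}, {1, 0, 1, 0}, {1, 1, 0, 1}, {0, 0, 1, 0}};
        int n = A.length, v = 0;

        List<Integer> listP = new ArrayList<>();
        BitSet bitP = new BitSet(n), neighbours = new BitSet(n);
        for (int w = 1; w < n; w++) { listP.add(w); bitP.set(w); }
        for (int w = 0; w < n; w++) if (A[v][w] == 1) neighbours.set(w);

        System.out.println(filterList(listP, v, A));       // prints [1, 2]
        System.out.println(filterBits(bitP, neighbours));  // prints {1, 2}
    }
}
```

The list version inspects one candidate at a time, whereas BitSet.and processes many vertices per machine word, which is one plausible source of the observed speed up.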

It was demonstrated in our basic algorithm MC (which does not use a colouring bound) that
the order in which vertices are selected can have an enormous effect on search effort (and this is
well known). The best order was to select vertices of low degree first, i.e. the worst-out heuristic
in [19]. Incorporating this into MCSa as a tie breaker had negligible effect, because the small size
of the colour classes in hard (dense) instances left little scope for tie-breaking.
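
One simple way to obtain a low-degree-first order is to sort the vertex indices by degree, as sketched below. This is only an illustration of a static ordering by degree; the names DegreeOrder and lowDegreeFirst are hypothetical and the sketch is not the selection code used in MC.

```java
import java.util.Arrays;
import java.util.Comparator;

public class DegreeOrder {

    // Return the vertices 0..n-1 sorted by non-decreasing degree.
    // Arrays.sort on objects is stable, so ties keep their index order.
    static Integer[] lowDegreeFirst(int[] degree) {
        Integer[] order = new Integer[degree.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingInt(v -> degree[v]));
        return order;
    }

    public static void main(String[] args) {
        int[] degree = {3, 1, 2, 1};   // made-up degrees for four vertices
        System.out.println(Arrays.toString(lowDegreeFirst(degree)));  // prints [1, 3, 2, 0]
    }
}
```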



The -server and -client options were also tried. The -server option sometimes gave speedups of a
factor of 2 and sometimes of a factor of 0.5, and this can also affect relative algorithmic performance.
New benchmark problems (i.e. problems rarely investigated by the maximum clique community)
were investigated, such as BHOSLIB, k-regular and small-world graphs. The motivation for this
study was partly to compare algorithms but also to explore these problems to determine if and
when they are hard.

Finally we demonstrated that the standard procedure for calibrating machines and rescaling
results is unsafe, and that running our own code on different machines can lead to different relative
algorithmic performance. This is disturbing. First, it suggests that to perform a fair and reliable
empirical study we should not rescale others' results: we must either code up the algorithms
ourselves, as done here and also by Carmo and Züge [5], or download and run code on our machines.
Second, we should run our experiments on different machines.



All the code used in this study is available at http://www.dcs.gla.ac.uk/~pat/maxClique 
Appendix 1

Listing 1.11 shows how we read in a graph in DIMACS clq format (lines 10 to 27, method readDIMACS,
delivering an adjacency matrix A and an integer array of degrees degree), create an instance of
one of our algorithm classes in the requested style (lines 32 to 44) and then search for a maximum
clique and print it out (lines 46 to 52) along with run time statistics. An example of running from
the command line is as follows:

> java MaxClique BBMC1 brock200_1.clq 14400

This will apply BBMC with style = 1 to the first brock200 DIMACS instance allowing 14400
seconds of cpu time.
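
The reader in Listing 1.11 scans tokens until it meets "p", so a comment line that happened to contain a lone p token would be mistaken for the problem line. A line-oriented variant that skips comment lines explicitly might look like the sketch below. It assumes the usual DIMACS conventions (lines beginning with c are comments, p is the problem line, e is an edge); the class name DimacsLines and the in-memory sample are illustrative only.

```java
import java.util.Arrays;

public class DimacsLines {

    // Parse DIMACS clq text line by line and return a 0/1 adjacency matrix.
    static int[][] parse(String text) {
        int[][] A = null;
        for (String line : text.split("\\r?\\n")) {
            String[] t = line.trim().split("\\s+");
            if (t[0].equals("p")) {
                int n = Integer.parseInt(t[2]);       // problem line: p edge n m
                A = new int[n][n];
            } else if (t[0].equals("e") && A != null) {
                int i = Integer.parseInt(t[1]) - 1;   // vertices in the file are 1-based
                int j = Integer.parseInt(t[2]) - 1;
                A[i][j] = A[j][i] = 1;
            }
            // comment lines, blank lines and anything else are skipped
        }
        return A;
    }

    public static void main(String[] args) {
        String sample = "c a path on three vertices\n"
                      + "p edge 3 2\n"
                      + "e 1 2\n"
                      + "e 2 3\n";
        System.out.println(Arrays.deepToString(parse(sample)));
    }
}
```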



Appendix 2 

Listing 1.12 is our code for generating Erdős-Rényi random graphs G(n,p), where n is the number
of vertices and each edge is included in the graph with probability p, independently of every other
edge. It produces a random graph in DIMACS format with vertices numbered 1 to n inclusive. It
can be run from the command line as follows to produce a clq file:

> java RandomGraph 100 0.9 > 100-90-00.clq

import java.util.*;
import java.io.*;

public class MaxClique {

    static int[] degree;   // degree of vertices
    static int[][] A;      // 0/1 adjacency matrix
    static int n;          // number of vertices

    static void readDIMACS(String fname) throws IOException {
        String s   = "";
        Scanner sc = new Scanner(new File(fname));
        while (sc.hasNext() && !s.equals("p")) s = sc.next();
        sc.next();                                    // skip the word "edge"
        n          = sc.nextInt();
        int m      = sc.nextInt();                    // edge count, read but not used
        degree     = new int[n];
        A          = new int[n][n];
        while (sc.hasNext()) {
            s     = sc.next();                        // skip the "e" edge marker
            int i = sc.nextInt() - 1;
            int j = sc.nextInt() - 1;
            degree[i]++; degree[j]++;
            A[i][j] = A[j][i] = 1;
        }
        sc.close();
    }

    public static void main(String[] args) throws IOException {
        readDIMACS(args[1]);
        MC mc = null;
        if (args[0].equals("MC"))          mc = new MC(n,A,degree);
        else if (args[0].equals("MC0"))    mc = new MC0(n,A,degree);
        else if (args[0].equals("MCQ1"))   mc = new MCQ(n,A,degree,1);
        else if (args[0].equals("MCQ2"))   mc = new MCQ(n,A,degree,2);
        else if (args[0].equals("MCQ3"))   mc = new MCQ(n,A,degree,3);
        else if (args[0].equals("MCSa1"))  mc = new MCSa(n,A,degree,1);
        else if (args[0].equals("MCSa2"))  mc = new MCSa(n,A,degree,2);
        else if (args[0].equals("MCSa3"))  mc = new MCSa(n,A,degree,3);
        else if (args[0].equals("MCSb1"))  mc = new MCSb(n,A,degree,1);
        else if (args[0].equals("MCSb2"))  mc = new MCSb(n,A,degree,2);
        else if (args[0].equals("MCSb3"))  mc = new MCSb(n,A,degree,3);
        else if (args[0].equals("BBMC1"))  mc = new BBMC(n,A,degree,1);
        else if (args[0].equals("BBMC2"))  mc = new BBMC(n,A,degree,2);
        else if (args[0].equals("BBMC3"))  mc = new BBMC(n,A,degree,3);
        else return;
        System.gc();
        if (args.length > 2) mc.timeLimit = 1000 * (long)Integer.parseInt(args[2]);
        long cpuTime = System.currentTimeMillis();
        mc.search();
        cpuTime = System.currentTimeMillis() - cpuTime;
        System.out.println(mc.maxSize +" "+ mc.nodes +" "+ cpuTime);
        for (int i=0;i<mc.n;i++) if (mc.solution[i] == 1) System.out.print(i+1 +" ");
        System.out.println();
    }
}

Listing 1.11. MaxClique

import java.util.*;

public class RandomGraph {

    public static void main(String[] args) throws Exception {
        int n      = Integer.parseInt(args[0]);
        double p   = Double.parseDouble(args[1]);
        Random gen = new Random();
        // the edge count in the header is written as 0; Listing 1.11 does not use it
        System.out.println("p edge " + n + " 0");
        for (int i=0;i<n-1;i++)
            for (int j=i+1;j<n;j++)
                if (p >= gen.nextDouble())
                    System.out.println("e " + (i+1) + " " + (j+1));
    }
}



Listing 1.12. RandomGraph 
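
Listing 1.12 uses an unseeded generator and writes 0 as the edge count, so a given graph cannot be regenerated later. A variant that takes a seed and reports the actual number of edges could be sketched as follows. The class name SeededRandomGraph, the method edges and the seed value 42 are assumptions made for illustration and are not part of the code used in the study.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SeededRandomGraph {

    // Edge list of G(n,p) drawn with a fixed seed, vertices numbered 1..n.
    // The same (n, p, seed) always yields the same graph.
    static List<int[]> edges(int n, double p, long seed) {
        Random gen = new Random(seed);
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < n - 1; i++)
            for (int j = i + 1; j < n; j++)
                if (p >= gen.nextDouble()) result.add(new int[] {i + 1, j + 1});
        return result;
    }

    public static void main(String[] args) {
        int n = 100;
        double p = 0.9;
        long seed = 42L;                       // arbitrary illustrative seed
        List<int[]> es = edges(n, p, seed);
        System.out.println("p edge " + n + " " + es.size());
        for (int[] e : es) System.out.println("e " + e[0] + " " + e[1]);
    }
}
```

For n = 100 and p = 0.9 the expected number of edges is p n(n-1)/2 = 0.9 x 4950 = 4455, which gives a quick sanity check on the header line.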